Overview

Dataset statistics

Number of variables: 28
Number of observations: 46096
Missing cells: 43527
Missing cells (%): 3.4%
Duplicate rows: 164
Duplicate rows (%): 0.4%
Total size in memory: 67.4 MiB
Average record size in memory: 1.5 KiB
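
The percentages above follow directly from the raw counts. The short sketch below recomputes them as a sanity check; the variable names are illustrative, and it assumes the missing-cell percentage is taken over all cells (variables × observations).

```python
# Figures copied from the "Dataset statistics" block of this report.
n_variables = 28
n_observations = 46096
missing_cells = 43527
duplicate_rows = 164
total_size_mib = 67.4

# Assumption: the missing-cell percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations
avg_record_kib = total_size_mib * 1024 / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_kib:.1f} KiB")
```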

Variable types

Numeric: 8
Categorical: 17
Text: 3

Alerts

Duplicates: Dataset has 164 (0.4%) duplicate rows
High correlation: Ano Letivo de Previsão de Conclusão is highly overall correlated with Ano de Ingresso and 1 other field
High correlation: Ano de Ingresso is highly overall correlated with Ano Letivo de Previsão de Conclusão and 1 other field
High correlation: Campus is highly overall correlated with Código Curso
High correlation: Carga Horária Obrigatória Pendente is highly overall correlated with Ano Letivo de Previsão de Conclusão and 6 other fields
High correlation: Carga Horária de Prática Profissional Pendente is highly overall correlated with Carga Horária Obrigatória Pendente and 4 other fields
High correlation: Carga Horária de Seminário Pendente is highly overall correlated with Carga Horária Obrigatória Pendente and 4 other fields
High correlation: Código Curso is highly overall correlated with Campus
High correlation: Forma de Ingresso is highly overall correlated with Nível de Ensino and 1 other field
High correlation: Frequência no Período is highly overall correlated with class
High correlation: I.R.A. is highly overall correlated with Percentual de Progresso and 1 other field
High correlation: Modalidade is highly overall correlated with Nível de Ensino and 1 other field
High correlation: Nível de Ensino is highly overall correlated with Forma de Ingresso and 2 other fields
High correlation: Percentual de Progresso is highly overall correlated with Carga Horária Obrigatória Pendente and 6 other fields
High correlation: Período Atual is highly overall correlated with Carga Horária Obrigatória Pendente and 5 other fields
High correlation: Prática Profissional Pendente is highly overall correlated with Carga Horária Obrigatória Pendente and 4 other fields
High correlation: Registro de TCC Pendente is highly overall correlated with Forma de Ingresso and 2 other fields
High correlation: class is highly overall correlated with Frequência no Período and 1 other field
Imbalance: Estado Civil is highly imbalanced (76.9%)
Missing: Percentual de Progresso has 6063 (13.2%) missing values
Missing: Renda Per Capita has 2228 (4.8%) missing values
Missing: Prática Profissional Pendente has 6967 (15.1%) missing values
Missing: Carga Horária de Prática Profissional Pendente has 6967 (15.1%) missing values
Missing: Registro de TCC Pendente has 6969 (15.1%) missing values
Missing: Carga Horária de Seminário Pendente has 7236 (15.7%) missing values
Missing: Carga Horária Obrigatória Pendente has 6967 (15.1%) missing values
Zeros: Frequência no Período has 15139 (32.8%) zeros
Zeros: I.R.A. has 10770 (23.4%) zeros
Zeros: Percentual de Progresso has 4441 (9.6%) zeros
Zeros: Renda Per Capita has 2552 (5.5%) zeros

Reproduction

Analysis started: 2024-11-22 18:49:36.421429
Analysis finished: 2024-11-22 18:49:53.405567
Duration: 16.98 seconds
Software version: ydata-profiling v4.9.0
Download configuration: config.json

Variables

Ano Letivo de Previsão de Conclusão
Real number (ℝ)

HIGH CORRELATION 

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2022.9457
Minimum: 2015
Maximum: 2028
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 720.2 KiB

Quantile statistics

Minimum: 2015
5th percentile: 2020
Q1: 2021
Median: 2023
Q3: 2024
95th percentile: 2026
Maximum: 2028
Range: 13
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.8585359
Coefficient of variation (CV): 0.0009187275
Kurtosis: -0.80535964
Mean: 2022.9457
Median Absolute Deviation (MAD): 1
Skewness: 0.064941545
Sum: 93249707
Variance: 3.4541557
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=14)
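
Several of the statistics reported for this variable are derived from one another. The sketch below, with illustrative variable names, recomputes the range, IQR, coefficient of variation, and variance from the other reported figures; it is a consistency check on the numbers above, not a description of the tool's internals.

```python
# Values copied from the statistics reported for "Ano Letivo de Previsão de Conclusão".
minimum, maximum = 2015, 2028
q1, q3 = 2021, 2024
mean = 2022.9457
std = 1.8585359

value_range = maximum - minimum   # reported as Range
iqr = q3 - q1                     # reported as Interquartile range (IQR)
cv = std / mean                   # reported as Coefficient of variation (CV)
variance = std ** 2               # reported as Variance

print(f"Range = {value_range}, IQR = {iqr}")
print(f"CV = {cv:.10f}")
print(f"Variance = {variance:.7f}")
```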

Common values

Value             Count  Frequency (%)
2022              8281   18.0%
2023              8133   17.6%
2024              7588   16.5%
2021              7342   15.9%
2025              5401   11.7%
2026              4725   10.3%
2020              3575   7.8%
2019              665    1.4%
2027              287    0.6%
2028              64     0.1%
Other values (4)  35     0.1%

Minimum 10 values

Value  Count  Frequency (%)
2015   1      < 0.1%
2016   1      < 0.1%
2017   4      < 0.1%
2018   29     0.1%
2019   665    1.4%
2020   3575   7.8%
2021   7342   15.9%
2022   8281   18.0%
2023   8133   17.6%
2024   7588   16.5%

Maximum 10 values

Value  Count  Frequency (%)
2028   64     0.1%
2027   287    0.6%
2026   4725   10.3%
2025   5401   11.7%
2024   7588   16.5%
2023   8133   17.6%
2022   8281   18.0%
2021   7342   15.9%
2020   3575   7.8%
2019   665    1.4%

Ano de Ingresso
Real number (ℝ)

HIGH CORRELATION 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2020.4369
Minimum: 2018
Maximum: 2023
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 720.2 KiB

Quantile statistics

Minimum: 2018
5th percentile: 2018
Q1: 2019
Median: 2020
Q3: 2022
95th percentile: 2023
Maximum: 2023
Range: 5
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.6815474
Coefficient of variation (CV): 0.0008322692
Kurtosis: -1.2393699
Mean: 2020.4369
Median Absolute Deviation (MAD): 1
Skewness: 0.031654989
Sum: 93134058
Variance: 2.8276016
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=6)

Common values

Value  Count  Frequency (%)
2020   7915   17.2%
2019   7900   17.1%
2022   7868   17.1%
2018   7853   17.0%
2021   7836   17.0%
2023   6724   14.6%

Minimum 10 values

Value  Count  Frequency (%)
2018   7853   17.0%
2019   7900   17.1%
2020   7915   17.2%
2021   7836   17.0%
2022   7868   17.1%
2023   6724   14.6%

Maximum 10 values

Value  Count  Frequency (%)
2023   6724   14.6%
2022   7868   17.1%
2021   7836   17.0%
2020   7915   17.2%
2019   7900   17.1%
2018   7853   17.0%
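
Because Ano de Ingresso has only six distinct values, its frequency table is complete, and the summary statistics can be rebuilt from it. The sketch below does this with the standard library only. It assumes the reported variance is the sample variance (divisor n - 1), which is what reproduces the reported figure; the helper `value_at_rank` is illustrative.

```python
# Complete frequency table for "Ano de Ingresso", copied from this report.
counts = {2018: 7853, 2019: 7900, 2020: 7915, 2021: 7836, 2022: 7868, 2023: 6724}

n = sum(counts.values())
total = sum(value * count for value, count in counts.items())
mean = total / n

# Assumption: sample variance with divisor n - 1.
variance = sum(count * (value - mean) ** 2 for value, count in counts.items()) / (n - 1)


def value_at_rank(rank):
    """Return the value at a 1-based rank in the sorted data."""
    seen = 0
    for value in sorted(counts):
        seen += counts[value]
        if rank <= seen:
            return value
    raise ValueError("rank exceeds the number of observations")


# n is even, so the median averages the two middle observations.
median = (value_at_rank(n // 2) + value_at_rank(n // 2 + 1)) / 2

print(f"n = {n}, sum = {total}")
print(f"mean = {mean:.4f}, median = {median}, variance = {variance:.7f}")
```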

Campus
Categorical

HIGH CORRELATION 

Distinct: 22
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.6 MiB

Value              Count
CNAT               11054
MO                 3059
ZN                 2216
PAR                2060
CAL                2052
Other values (17)  25655

Length

Max length: 4
Median length: 2
Mean length: 2.7650772
Min length: 2

Characters and Unicode

Total characters: 127459
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AP
2nd row: AP
3rd row: AP
4th row: AP
5th row: AP
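
The "Characters and Unicode" figures rest on per-character Unicode properties. As a rough illustration, the sketch below counts general categories with Python's standard `unicodedata` module; the function name and the small sample (values that appear in this report) are illustrative, and the report generator may compute its counts differently.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in the values."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Illustrative sample; accented letters are written as escapes so that each
# one is a single precomposed code point.
sample = ["CNAT", "MO", "N\u00e3o declarado", "Uni\u00e3o Est\u00e1vel"]
counts = category_counts(sample)

# Lu = Uppercase Letter, Ll = Lowercase Letter, Zs = Space Separator
for category, count in counts.most_common():
    print(category, count)
```

Campus codes such as CNAT contribute only to the Uppercase Letter category, which is consistent with the single distinct category reported for this variable.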

Common Values

Value              Count  Frequency (%)
CNAT               11054  24.0%
MO                 3059   6.6%
ZN                 2216   4.8%
PAR                2060   4.5%
CAL                2052   4.5%
CA                 2051   4.4%
ZL                 2014   4.4%
SC                 1926   4.2%
NC                 1912   4.1%
IP                 1877   4.1%
Other values (12)  15875  34.4%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
cnat               11054  24.0%
mo                 3059   6.6%
zn                 2216   4.8%
par                2060   4.5%
cal                2052   4.5%
ca                 2051   4.4%
zl                 2014   4.4%
sc                 1926   4.2%
nc                 1912   4.1%
ip                 1877   4.1%
Other values (12)  15875  34.4%

Most occurring characters

Value             Count  Frequency (%)
C                 26446  20.7%
A                 24492  19.2%
N                 18369  14.4%
P                 11429  9.0%
T                 11054  8.7%
S                 6193   4.9%
M                 5818   4.6%
L                 5003   3.9%
Z                 4230   3.3%
G                 3248   2.5%
Other values (6)  11177  8.8%

Most occurring categories

Value             Count   Frequency (%)
Uppercase Letter  127459  100.0%

Most occurring scripts

Value  Count   Frequency (%)
Latin  127459  100.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  127459  100.0%

Most frequent character per category, per script, and per block

Every character in this variable is an uppercase Latin letter in the ASCII block, so the per-category (Uppercase Letter), per-script (Latin), and per-block (ASCII) character tables are all identical to the "Most occurring characters" table.

Código Curso
Real number (ℝ)

HIGH CORRELATION 

Distinct: 180
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 8653.8624
Minimum: 1022
Maximum: 22437
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 720.2 KiB

Quantile statistics

Minimum: 1022
5th percentile: 1057
Q1: 2025
Median: 8079
Q3: 14120
95th percentile: 19401
Maximum: 22437
Range: 21415
Interquartile range (IQR): 12095

Descriptive statistics

Standard deviation: 6434.8997
Coefficient of variation (CV): 0.74358701
Kurtosis: -1.2764533
Mean: 8653.8624
Median Absolute Deviation (MAD): 6054
Skewness: 0.30342111
Sum: 3.9890844 × 10^8
Variance: 41407934
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value               Count  Frequency (%)
1051                903    2.0%
1057                834    1.8%
15121               757    1.6%
1436                577    1.3%
1101                502    1.1%
14401               499    1.1%
1405                498    1.1%
3116                488    1.1%
1301                481    1.0%
18101               479    1.0%
Other values (170)  40078  86.9%

Minimum 10 values

Value  Count  Frequency (%)
1022   425    0.9%
1025   399    0.9%
1045   472    1.0%
1051   903    2.0%
1057   834    1.8%
1059   95     0.2%
1101   502    1.1%
1102   131    0.3%
1104   319    0.7%
1110   261    0.6%

Maximum 10 values

Value  Count  Frequency (%)
22437  33     0.1%
21434  129    0.3%
21432  302    0.7%
21411  83     0.2%
21401  274    0.6%
20505  82     0.2%
20500  114    0.2%
20411  123    0.3%
20401  328    0.7%
20110  290    0.6%

[Variable name and type missing in source; the layout matches a Text variable]

Distinct: 180
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 10.2 MiB

Length

Max length: 127
Median length: 101
Mean length: 80.049744
Min length: 22

Characters and Unicode

Total characters: 3689973
Distinct characters: 76
Distinct categories: 9
Distinct scripts: 3
Distinct blocks: 3

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: Técnico de Nivel Médio em Agropecuária, na Forma Subsequente [2012] - Campus Apodi
2nd row: Técnico de Nivel Médio em Informática, na Forma Integrado (2012) - Campus Apodi
3rd row: Técnico de Nível Médio em Manutenção e Suporte em Informática, na Forma Subsequente [2014] - Campus Apodi
4th row: Técnico de Nível Médio em Química, na Forma Subsequente (2015) - Campus Apodi
5th row: Técnico de Nível Médio em Biocombustíveis, na Forma Integrado (2012) - Campus Apodi
ValueCountFrequency (%)
em 47053
 
8.0%
46379
 
7.9%
campus 42591
 
7.3%
de 37739
 
6.5%
na 34182
 
5.8%
técnico 33710
 
5.8%
médio 33509
 
5.7%
forma 33509
 
5.7%
2012 32684
 
5.6%
nível 30459
 
5.2%
Other values (161) 213226
36.4%

Most occurring characters

ValueCountFrequency (%)
541844
 
14.7%
a 289329
 
7.8%
e 271431
 
7.4%
o 219037
 
5.9%
n 179195
 
4.9%
m 160552
 
4.4%
i 160204
 
4.3%
r 137989
 
3.7%
c 120389
 
3.3%
d 116781
 
3.2%
Other values (66) 1493222
40.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2412149
65.4%
Space Separator 541844
 
14.7%
Uppercase Letter 372727
 
10.1%
Decimal Number 181820
 
4.9%
Dash Punctuation 55607
 
1.5%
Close Punctuation 45866
 
1.2%
Open Punctuation 45866
 
1.2%
Other Punctuation 33999
 
0.9%
Nonspacing Mark 95
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 289329
12.0%
e 271431
11.3%
o 219037
 
9.1%
n 179195
 
7.4%
m 160552
 
6.7%
i 160204
 
6.6%
r 137989
 
5.7%
c 120389
 
5.0%
d 116781
 
4.8%
t 115883
 
4.8%
Other values (25) 641359
26.6%
Uppercase Letter
ValueCountFrequency (%)
C 71942
19.3%
N 49554
13.3%
M 48768
13.1%
T 42416
11.4%
F 35960
9.6%
I 34045
9.1%
S 23554
 
6.3%
A 14632
 
3.9%
E 12560
 
3.4%
P 11208
 
3.0%
Other values (11) 28088
 
7.5%
Decimal Number
ValueCountFrequency (%)
2 79449
43.7%
0 45311
24.9%
1 44706
24.6%
4 4359
 
2.4%
5 4348
 
2.4%
9 1446
 
0.8%
8 919
 
0.5%
3 691
 
0.4%
6 348
 
0.2%
7 243
 
0.1%
Other Punctuation
ValueCountFrequency (%)
, 33328
98.0%
. 576
 
1.7%
/ 95
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 29080
63.4%
] 16786
36.6%
Open Punctuation
ValueCountFrequency (%)
( 29080
63.4%
[ 16786
36.6%
Space Separator
ValueCountFrequency (%)
541844
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 55607
100.0%
Nonspacing Mark
ValueCountFrequency (%)
̂ 95
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2784876
75.5%
Common 905002
 
24.5%
Inherited 95
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 289329
 
10.4%
e 271431
 
9.7%
o 219037
 
7.9%
n 179195
 
6.4%
m 160552
 
5.8%
i 160204
 
5.8%
r 137989
 
5.0%
c 120389
 
4.3%
d 116781
 
4.2%
t 115883
 
4.2%
Other values (46) 1014086
36.4%
Common
ValueCountFrequency (%)
541844
59.9%
2 79449
 
8.8%
- 55607
 
6.1%
0 45311
 
5.0%
1 44706
 
4.9%
, 33328
 
3.7%
) 29080
 
3.2%
( 29080
 
3.2%
[ 16786
 
1.9%
] 16786
 
1.9%
Other values (9) 13025
 
1.4%
Inherited
ValueCountFrequency (%)
̂ 95
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3525890
95.6%
None 163988
 
4.4%
Diacriticals 95
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
541844
15.4%
a 289329
 
8.2%
e 271431
 
7.7%
o 219037
 
6.2%
n 179195
 
5.1%
m 160552
 
4.6%
i 160204
 
4.5%
r 137989
 
3.9%
c 120389
 
3.4%
d 116781
 
3.3%
Other values (54) 1329139
37.7%
None
ValueCountFrequency (%)
é 70569
43.0%
í 34587
21.1%
á 14872
 
9.1%
ç 14830
 
9.0%
ã 13799
 
8.4%
ó 6006
 
3.7%
â 3664
 
2.2%
õ 3315
 
2.0%
ô 1318
 
0.8%
ú 670
 
0.4%
Diacriticals
ValueCountFrequency (%)
̂ 95
100.0%
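
The 95 characters reported as Nonspacing Mark (script Inherited, block Diacriticals) are combining circumflex accents (U+0302). This suggests that a few values store an accented letter in decomposed form, as a base letter followed by a combining mark, rather than as one precomposed code point. The sketch below shows how Unicode NFC normalization would merge such a pair; the sample string is hypothetical, and whether these values should be normalized before analysis is a judgment call for the data owner.

```python
import unicodedata

# Hypothetical value: 'a' followed by the combining circumflex U+0302.
decomposed = "Meca\u0302nica"

# NFC composes base letter + combining mark into one code point where possible.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))
print(unicodedata.category("\u0302"))   # general category of the combining mark
print(unicodedata.name(composed[3]))    # name of the composed character
```

After normalization the string no longer contains a Nonspacing Mark, so visually identical values also compare equal.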

Estado Civil
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 8
Missing (%): < 0.1%
Memory size: 2.9 MiB

Value          Count
Solteiro       42032
Casado         2977
União Estável  571
Divorciado     367
Viúvo          141

Length

Max length: 13
Median length: 8
Mean length: 7.939507
Min length: 5

Characters and Unicode

Total characters: 365916
Distinct characters: 22
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Solteiro
2nd row: Solteiro
3rd row: Solteiro
4th row: Casado
5th row: Solteiro

Common Values

Value          Count  Frequency (%)
Solteiro       42032  91.2%
Casado         2977   6.5%
União Estável  571    1.2%
Divorciado     367    0.8%
Viúvo          141    0.3%
(Missing)      8      < 0.1%
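
The alert for this variable reports an imbalance of 76.9%. One formula that reproduces that figure from the counts above is one minus the normalized Shannon entropy of the class distribution. The sketch below applies it; that the profiling tool computes the score exactly this way is an assumption, and the function name is illustrative.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class
    distribution and k is the number of classes (0 = balanced, 1 = one class)."""
    n_classes = len(counts)
    if n_classes < 2:
        return 0.0
    total = sum(counts)
    probs = [c / total for c in counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(n_classes)


# Non-missing counts for "Estado Civil", copied from the table above.
estado_civil = [42032, 2977, 571, 367, 141]
score = imbalance_score(estado_civil)
print(f"Imbalance: {score:.1%}")
```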

Length

Figures: Histogram of lengths of the category; Common Values (Plot)

Value       Count  Frequency (%)
solteiro    42032  90.1%
casado      2977   6.4%
união       571    1.2%
estável     571    1.2%
divorciado  367    0.8%
viúvo       141    0.3%
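
The word-level table differs slightly from the Common Values table (90.1% versus 91.2% for Solteiro) because "União Estável" contributes two words and the denominator becomes the total word count. The sketch below rebuilds the word table from the category counts; the lowercase-and-split-on-whitespace tokenization is an assumption that happens to match the reported numbers.

```python
from collections import Counter

# Non-missing category counts for "Estado Civil", copied from this report.
category_counts = {
    "Solteiro": 42032,
    "Casado": 2977,
    "União Estável": 571,
    "Divorciado": 367,
    "Viúvo": 141,
}

# Assumed tokenization: lowercase each value and split on whitespace.
word_counts = Counter()
for value, count in category_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(f"{word:<12}{count:>6}  {count / total_words:.1%}")
```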

Most occurring characters

ValueCountFrequency (%)
o 88487
24.2%
i 43478
11.9%
l 42603
11.6%
t 42603
11.6%
e 42603
11.6%
r 42399
11.6%
S 42032
11.5%
a 6321
 
1.7%
s 3548
 
1.0%
d 3344
 
0.9%
Other values (12) 8498
 
2.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 318686
87.1%
Uppercase Letter 46659
 
12.8%
Space Separator 571
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 88487
27.8%
i 43478
13.6%
l 42603
13.4%
t 42603
13.4%
e 42603
13.4%
r 42399
13.3%
a 6321
 
2.0%
s 3548
 
1.1%
d 3344
 
1.0%
v 1079
 
0.3%
Other values (5) 2221
 
0.7%
Uppercase Letter
ValueCountFrequency (%)
S 42032
90.1%
C 2977
 
6.4%
E 571
 
1.2%
U 571
 
1.2%
D 367
 
0.8%
V 141
 
0.3%
Space Separator
ValueCountFrequency (%)
571
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 365345
99.8%
Common 571
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 88487
24.2%
i 43478
11.9%
l 42603
11.7%
t 42603
11.7%
e 42603
11.7%
r 42399
11.6%
S 42032
11.5%
a 6321
 
1.7%
s 3548
 
1.0%
d 3344
 
0.9%
Other values (11) 7927
 
2.2%
Common
ValueCountFrequency (%)
571
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 364633
99.6%
None 1283
 
0.4%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 88487
24.3%
i 43478
11.9%
l 42603
11.7%
t 42603
11.7%
e 42603
11.7%
r 42399
11.6%
S 42032
11.5%
a 6321
 
1.7%
s 3548
 
1.0%
d 3344
 
0.9%
Other values (9) 7215
 
2.0%
None
ValueCountFrequency (%)
á 571
44.5%
ã 571
44.5%
ú 141
 
11.0%

Etnia/Raça
Categorical

Distinct: 6
Distinct (%): < 0.1%
Missing: 107
Missing (%): 0.2%
Memory size: 2.8 MiB

Value          Count
Parda          25753
Branca         14516
Preta          3836
Não declarado  1445
Amarela        305

Length

Max length: 13
Median length: 5
Mean length: 5.5890104
Min length: 5

Characters and Unicode

Total characters: 257033
Distinct characters: 19
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Parda
2nd row: Branca
3rd row: Branca
4th row: Parda
5th row: Parda

Common Values

Value          Count  Frequency (%)
Parda          25753  55.9%
Branca         14516  31.5%
Preta          3836   8.3%
Não declarado  1445   3.1%
Amarela        305    0.7%
Indígena       134    0.3%
(Missing)      107    0.2%

Length

Figures: Histogram of lengths of the category; Common Values (Plot)

Value      Count  Frequency (%)
parda      25753  54.3%
branca     14516  30.6%
preta      3836   8.1%
não        1445   3.0%
declarado  1445   3.0%
amarela    305    0.6%
indígena   134    0.3%

Most occurring characters

ValueCountFrequency (%)
a 88008
34.2%
r 45855
17.8%
P 29589
 
11.5%
d 28777
 
11.2%
c 15961
 
6.2%
n 14784
 
5.8%
B 14516
 
5.6%
e 5720
 
2.2%
t 3836
 
1.5%
o 2890
 
1.1%
Other values (9) 7097
 
2.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 209599
81.5%
Uppercase Letter 45989
 
17.9%
Space Separator 1445
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 88008
42.0%
r 45855
21.9%
d 28777
 
13.7%
c 15961
 
7.6%
n 14784
 
7.1%
e 5720
 
2.7%
t 3836
 
1.8%
o 2890
 
1.4%
l 1750
 
0.8%
ã 1445
 
0.7%
Other values (3) 573
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
P 29589
64.3%
B 14516
31.6%
N 1445
 
3.1%
A 305
 
0.7%
I 134
 
0.3%
Space Separator
ValueCountFrequency (%)
1445
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 255588
99.4%
Common 1445
 
0.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 88008
34.4%
r 45855
17.9%
P 29589
 
11.6%
d 28777
 
11.3%
c 15961
 
6.2%
n 14784
 
5.8%
B 14516
 
5.7%
e 5720
 
2.2%
t 3836
 
1.5%
o 2890
 
1.1%
Other values (8) 5652
 
2.2%
Common
ValueCountFrequency (%)
1445
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 255454
99.4%
None 1579
 
0.6%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 88008
34.5%
r 45855
18.0%
P 29589
 
11.6%
d 28777
 
11.3%
c 15961
 
6.2%
n 14784
 
5.8%
B 14516
 
5.7%
e 5720
 
2.2%
t 3836
 
1.5%
o 2890
 
1.1%
Other values (7) 5518
 
2.2%
None
ValueCountFrequency (%)
ã 1445
91.5%
í 134
 
8.5%

Forma de Ingresso
Categorical

HIGH CORRELATION 

Distinct: 28
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Value                                    Count
Ampla Concorrência                       24718
L6 - Qualquer Renda / Autodeclarado PPI  5407
L2 - Renda <= 1,5 / Autodeclarados PPI   5173
L5 - Qualquer Renda / Qualquer Etnia     2249
L1 - Renda <= 1,5 / Qualquer Etnia       2233
Other values (23)                        6316

Length

Max length: 60
Median length: 18
Mean length: 27.373199
Min length: 8

Characters and Unicode

Total characters: 1261795
Distinct characters: 57
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Ampla Concorrência
2nd row: Ampla Concorrência
3rd row: Ampla Concorrência
4th row: Ampla Concorrência
5th row: L6 - Qualquer Renda / Autodeclarado PPI

Common Values

Value                                           Count  Frequency (%)
Ampla Concorrência                              24718  53.6%
L6 - Qualquer Renda / Autodeclarado PPI         5407   11.7%
L2 - Renda <= 1,5 / Autodeclarados PPI          5173   11.2%
L5 - Qualquer Renda / Qualquer Etnia            2249   4.9%
L1 - Renda <= 1,5 / Qualquer Etnia              2233   4.8%
L2 - Renda <= 1,5 / Autodeclarados PPI (SISU)   1391   3.0%
L6 - Qualquer Renda / Autodeclarado PPI (SISU)  1313   2.8%
Transferência Facultativa                       733    1.6%
L1 - Renda <= 1,5 / Qualquer Etnia (SISU)       641    1.4%
L5 - Qualquer Renda / Qualquer Etnia (SISU)     598    1.3%
Other values (18)                               1640   3.6%

Length

Histogram of lengths of the category

ValueCountFrequency (%)
50069
24.1%
ampla 24718
11.9%
concorrência 24718
11.9%
renda 19378
 
9.3%
qualquer 16446
 
7.9%
ppi 13422
 
6.5%
1,5 9572
 
4.6%
autodeclarado 6796
 
3.3%
l6 6720
 
3.2%
autodeclarados 6628
 
3.2%
Other values (32) 28934
14.0%

Most occurring characters

ValueCountFrequency (%)
161305
 
12.8%
a 122868
 
9.7%
r 82170
 
6.5%
n 78098
 
6.2%
o 77722
 
6.2%
c 66259
 
5.3%
l 56190
 
4.5%
e 55269
 
4.4%
u 47205
 
3.7%
d 46949
 
3.7%
Other values (47) 467760
37.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 795737
63.1%
Uppercase Letter 186773
 
14.8%
Space Separator 161305
 
12.8%
Decimal Number 40191
 
3.2%
Other Punctuation 30009
 
2.4%
Dash Punctuation 20062
 
1.6%
Math Symbol 19144
 
1.5%
Open Punctuation 4287
 
0.3%
Close Punctuation 4287
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 122868
15.4%
r 82170
10.3%
n 78098
9.8%
o 77722
9.8%
c 66259
8.3%
l 56190
7.1%
e 55269
6.9%
u 47205
 
5.9%
d 46949
 
5.9%
i 35641
 
4.5%
Other values (15) 127366
16.0%
Uppercase Letter
ValueCountFrequency (%)
A 38142
20.4%
P 26847
14.4%
C 24973
13.4%
L 20062
10.7%
R 19619
10.5%
I 17759
9.5%
Q 16446
8.8%
S 8605
 
4.6%
E 6640
 
3.6%
U 4287
 
2.3%
Other values (6) 3393
 
1.8%
Decimal Number
ValueCountFrequency (%)
1 13431
33.4%
5 13103
32.6%
6 6720
16.7%
2 6564
16.3%
3 163
 
0.4%
4 76
 
0.2%
9 72
 
0.2%
0 62
 
0.2%
Other Punctuation
ValueCountFrequency (%)
/ 20435
68.1%
, 9574
31.9%
Math Symbol
ValueCountFrequency (%)
= 9572
50.0%
< 9572
50.0%
Space Separator
ValueCountFrequency (%)
161305
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 20062
100.0%
Open Punctuation
ValueCountFrequency (%)
( 4287
100.0%
Close Punctuation
ValueCountFrequency (%)
) 4287
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 982510
77.9%
Common 279285
 
22.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 122868
 
12.5%
r 82170
 
8.4%
n 78098
 
7.9%
o 77722
 
7.9%
c 66259
 
6.7%
l 56190
 
5.7%
e 55269
 
5.6%
u 47205
 
4.8%
d 46949
 
4.8%
A 38142
 
3.9%
Other values (31) 311638
31.7%
Common
ValueCountFrequency (%)
161305
57.8%
/ 20435
 
7.3%
- 20062
 
7.2%
1 13431
 
4.8%
5 13103
 
4.7%
, 9574
 
3.4%
= 9572
 
3.4%
< 9572
 
3.4%
6 6720
 
2.4%
2 6564
 
2.4%
Other values (6) 8947
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1235808
97.9%
None 25987
 
2.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
161305
 
13.1%
a 122868
 
9.9%
r 82170
 
6.6%
n 78098
 
6.3%
o 77722
 
6.3%
c 66259
 
5.4%
l 56190
 
4.5%
e 55269
 
4.5%
u 47205
 
3.8%
d 46949
 
3.8%
Other values (41) 441773
35.7%
None
ValueCountFrequency (%)
ê 25706
98.9%
ó 122
 
0.5%
ã 57
 
0.2%
â 50
 
0.2%
ç 50
 
0.2%
í 2
 
< 0.1%

Frequência no Período
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 3939
Distinct (%): 8.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 61.702882
Minimum: 0
Maximum: 100
Zeros: 15139
Zeros (%): 32.8%
Negative: 0
Negative (%): 0.0%
Memory size: 720.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 90.97
Q3: 98.1
95th percentile: 100
Maximum: 100
Range: 100
Interquartile range (IQR): 98.1

Descriptive statistics

Standard deviation: 44.54313
Coefficient of variation (CV): 0.72189707
Kurtosis: -1.5595421
Mean: 61.702882
Median Absolute Deviation (MAD): 9.03
Skewness: -0.59849979
Sum: 2844256
Variance: 1984.0904
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
0                    15139  32.8%
100                  7343   15.9%
95                   60     0.1%
99.67                54     0.1%
98                   53     0.1%
99.5                 52     0.1%
90                   52     0.1%
99                   49     0.1%
99.64                47     0.1%
99.34                41     0.1%
Other values (3929)  23206  50.3%

Minimum 10 values

Value  Count  Frequency (%)
0      15139  32.8%
2.46   1      < 0.1%
3.27   1      < 0.1%
4.83   1      < 0.1%
5      4      < 0.1%
5.55   1      < 0.1%
6.47   1      < 0.1%
6.66   1      < 0.1%
7.5    1      < 0.1%
7.61   1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
100    7343   15.9%
99.94  4      < 0.1%
99.92  8      < 0.1%
99.91  11     < 0.1%
99.9   4      < 0.1%
99.88  5      < 0.1%
99.87  2      < 0.1%
99.86  1      < 0.1%
99.85  7      < 0.1%
99.84  13     < 0.1%

I.R.A.
Real number (ℝ)

HIGH CORRELATION  ZEROS 

Distinct: 7508
Distinct (%): 16.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.617633
Minimum: 0
Maximum: 100
Zeros: 10770
Zeros (%): 23.4%
Negative: 0
Negative (%): 0.0%
Memory size: 720.2 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 4.695
Median: 71
Q3: 82.19
95th percentile: 91.1
Maximum: 100
Range: 100
Interquartile range (IQR): 77.495

Descriptive statistics

Standard deviation: 35.329554
Coefficient of variation (CV): 0.65891671
Kurtosis: -1.3216253
Mean: 53.617633
Median Absolute Deviation (MAD): 16.03
Skewness: -0.61292788
Sum: 2471558.4
Variance: 1248.1774
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Common values

Value                Count  Frequency (%)
0                    10770  23.4%
100                  131    0.3%
70                   82     0.2%
60                   60     0.1%
80                   57     0.1%
76                   43     0.1%
85                   43     0.1%
78                   36     0.1%
90                   35     0.1%
84                   32     0.1%
Other values (7498)  34807  75.5%

Minimum 10 values

Value  Count  Frequency (%)
0      10770  23.4%
0.1    2      < 0.1%
0.11   2      < 0.1%
0.13   1      < 0.1%
0.17   1      < 0.1%
0.18   2      < 0.1%
0.19   10     < 0.1%
0.2    5      < 0.1%
0.21   2      < 0.1%
0.22   1      < 0.1%

Maximum 10 values

Value  Count  Frequency (%)
100    131    0.3%
99.84  1      < 0.1%
99.67  1      < 0.1%
99.53  1      < 0.1%
99.45  1      < 0.1%
99.4   1      < 0.1%
99.19  1      < 0.1%
99.11  1      < 0.1%
99.08  1      < 0.1%
99.06  1      < 0.1%

Matriz
Text

Distinct: 131
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 7.4 MiB

Length

Max length: 113
Median length: 79
Mean length: 48.639947
Min length: 22

Characters and Unicode

Total characters: 2242107
Distinct characters: 74
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: 136 - Técnico Subsequente em Agropecuária (2012)
2nd row: 106 - Técnico Integrado em Informática (2012)
3rd row: 148 - Técnico Subsequente em Manutenção e Suporte em Informatica (2012)
4th row: 152 - Técnico Subsequente em Química (2012)
5th row: 180 - Técnico Integrado em Biocombustíveis (2012)
ValueCountFrequency (%)
47517
 
13.4%
em 46826
 
13.2%
técnico 33022
 
9.3%
2012 30892
 
8.7%
integrado 19718
 
5.6%
subsequente 13319
 
3.8%
informática 7541
 
2.1%
tecnologia 6781
 
1.9%
de 5183
 
1.5%
licenciatura 4616
 
1.3%
Other values (265) 138477
39.1%

Most occurring characters

ValueCountFrequency (%)
310815
 
13.9%
e 181918
 
8.1%
n 121119
 
5.4%
o 119630
 
5.3%
c 117167
 
5.2%
i 109475
 
4.9%
a 103117
 
4.6%
2 95368
 
4.3%
t 84755
 
3.8%
1 77937
 
3.5%
Other values (64) 920806
41.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1315633
58.7%
Decimal Number 315817
 
14.1%
Space Separator 310815
 
13.9%
Uppercase Letter 159691
 
7.1%
Dash Punctuation 47569
 
2.1%
Close Punctuation 45386
 
2.0%
Open Punctuation 45349
 
2.0%
Other Punctuation 1847
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 181918
13.8%
n 121119
9.2%
o 119630
9.1%
c 117167
8.9%
i 109475
 
8.3%
a 103117
 
7.8%
t 84755
 
6.4%
m 75244
 
5.7%
r 66650
 
5.1%
u 46969
 
3.6%
Other values (25) 289589
22.0%
Uppercase Letter
ValueCountFrequency (%)
T 42266
26.5%
I 32098
20.1%
S 19298
12.1%
E 12279
 
7.7%
A 11557
 
7.2%
M 9747
 
6.1%
L 5931
 
3.7%
C 4769
 
3.0%
G 4023
 
2.5%
D 3539
 
2.2%
Other values (10) 14184
 
8.9%
Decimal Number
ValueCountFrequency (%)
2 95368
30.2%
1 77937
24.7%
0 60396
19.1%
6 15791
 
5.0%
3 13428
 
4.3%
4 13404
 
4.2%
9 11658
 
3.7%
8 11628
 
3.7%
5 9539
 
3.0%
7 6668
 
2.1%
Other Punctuation
ValueCountFrequency (%)
, 1502
81.3%
. 250
 
13.5%
/ 95
 
5.1%
Close Punctuation
ValueCountFrequency (%)
) 35594
78.4%
] 9792
 
21.6%
Open Punctuation
ValueCountFrequency (%)
( 35557
78.4%
[ 9792
 
21.6%
Space Separator
ValueCountFrequency (%)
310815
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 47569
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1475324
65.8%
Common 766783
34.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 181918
12.3%
n 121119
 
8.2%
o 119630
 
8.1%
c 117167
 
7.9%
i 109475
 
7.4%
a 103117
 
7.0%
t 84755
 
5.7%
m 75244
 
5.1%
r 66650
 
4.5%
u 46969
 
3.2%
Other values (45) 449280
30.5%
Common
ValueCountFrequency (%)
310815
40.5%
2 95368
 
12.4%
1 77937
 
10.2%
0 60396
 
7.9%
- 47569
 
6.2%
) 35594
 
4.6%
( 35557
 
4.6%
6 15791
 
2.1%
3 13428
 
1.8%
4 13404
 
1.7%
Other values (9) 60924
 
7.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2158890
96.3%
None 83217
 
3.7%

Most frequent character per block

ASCII
ValueCountFrequency (%)
310815
 
14.4%
e 181918
 
8.4%
n 121119
 
5.6%
o 119630
 
5.5%
c 117167
 
5.4%
i 109475
 
5.1%
a 103117
 
4.8%
2 95368
 
4.4%
t 84755
 
3.9%
1 77937
 
3.6%
Other values (53) 837589
38.8%
None
ValueCountFrequency (%)
é 37160
44.7%
á 11838
 
14.2%
ç 11404
 
13.7%
ã 9263
 
11.1%
í 4916
 
5.9%
õ 3315
 
4.0%
â 1984
 
2.4%
ô 1318
 
1.6%
ó 896
 
1.1%
ú 670
 
0.8%

Modalidade
Categorical

HIGH CORRELATION 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.4 MiB

Value                Count
Técnico Integrado    19718
Técnico Subsequente  13319
Tecnologia           6781
Licenciatura         4616
Engenharia           989

Length

Max length: 21
Median length: 19
Mean length: 15.955658
Min length: 10

Characters and Unicode

Total characters: 735492
Distinct characters: 25
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 2

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Técnico Subsequente
2nd row: Técnico Integrado
3rd row: Técnico Subsequente
4th row: Técnico Subsequente
5th row: Técnico Integrado

Common Values

Value                  Count  Frequency (%)
Técnico Integrado      19718  42.8%
Técnico Subsequente    13319  28.9%
Tecnologia             6781   14.7%
Licenciatura           4616   10.0%
Engenharia             989    2.1%
Técnico Integrado EJA  673    1.5%

Length

Figures: Histogram of lengths of the category; Common Values (Plot)

Value         Count  Frequency (%)
técnico       33710  41.9%
integrado     20391  25.3%
subsequente   13319  16.5%
tecnologia    6781   8.4%
licenciatura  4616   5.7%
engenharia    989    1.2%
eja           673    0.8%

Most occurring characters

ValueCountFrequency (%)
c 83433
11.3%
n 80795
11.0%
e 72734
 
9.9%
o 67663
 
9.2%
i 50712
 
6.9%
T 40491
 
5.5%
a 38382
 
5.2%
t 38326
 
5.2%
34383
 
4.7%
é 33710
 
4.6%
Other values (15) 194863
26.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 619284
84.2%
Uppercase Letter 81825
 
11.1%
Space Separator 34383
 
4.7%

Most frequent character per category

Lowercase Letter
Value               Count    Frequency (%)
c                   83433    13.5%
n                   80795    13.0%
e                   72734    11.7%
o                   67663    10.9%
i                   50712     8.2%
a                   38382     6.2%
t                   38326     6.2%
é                   33710     5.4%
u                   31254     5.0%
g                   28161     4.5%
Other values (7)    94114    15.2%

Uppercase Letter
Value    Count    Frequency (%)
T        40491    49.5%
I        20391    24.9%
S        13319    16.3%
L         4616     5.6%
E         1662     2.0%
J          673     0.8%
A          673     0.8%

Space Separator
Value      Count    Frequency (%)
(space)    34383    100.0%

Most occurring scripts

Value     Count     Frequency (%)
Latin     701109    95.3%
Common     34383     4.7%

Most frequent character per script

Latin
Value                Count     Frequency (%)
c                    83433     11.9%
n                    80795     11.5%
e                    72734     10.4%
o                    67663      9.7%
i                    50712      7.2%
T                    40491      5.8%
a                    38382      5.5%
t                    38326      5.5%
é                    33710      4.8%
u                    31254      4.5%
Other values (14)    163609    23.3%

Common
Value      Count    Frequency (%)
(space)    34383    100.0%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    701782    95.4%
None      33710     4.6%

Most frequent character per block

ASCII
Value                Count     Frequency (%)
c                    83433     11.9%
n                    80795     11.5%
e                    72734     10.4%
o                    67663      9.6%
i                    50712      7.2%
T                    40491      5.8%
a                    38382      5.5%
t                    38326      5.5%
(space)              34383      4.9%
u                    31254      4.5%
Other values (14)    163609    23.3%

None
Value    Count    Frequency (%)
é        33710    100.0%

Nível de Ensino
Categorical

HIGH CORRELATION 

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     3.8 MiB

Médio        33710
Graduação    12386

Length

Max length       9
Median length    5
Mean length      6.0748004
Min length       5

Characters and Unicode

Total characters       280024
Distinct characters    11
Distinct categories    2
Distinct scripts       1
Distinct blocks        2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Médio
2nd row    Médio
3rd row    Médio
4th row    Médio
5th row    Médio

Common Values

Value        Count    Frequency (%)
Médio        33710    73.1%
Graduação    12386    26.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count    Frequency (%)
médio        33710    73.1%
graduação    12386    26.9%

Most occurring characters

Value    Count    Frequency (%)
d        46096    16.5%
o        46096    16.5%
M        33710    12.0%
é        33710    12.0%
i        33710    12.0%
a        24772     8.8%
G        12386     4.4%
r        12386     4.4%
u        12386     4.4%
ç        12386     4.4%

Most occurring categories

Value               Count     Frequency (%)
Lowercase Letter    233928    83.5%
Uppercase Letter     46096    16.5%

Most frequent character per category

Lowercase Letter
Value    Count    Frequency (%)
d        46096    19.7%
o        46096    19.7%
é        33710    14.4%
i        33710    14.4%
a        24772    10.6%
r        12386     5.3%
u        12386     5.3%
ç        12386     5.3%
ã        12386     5.3%

Uppercase Letter
Value    Count    Frequency (%)
M        33710    73.1%
G        12386    26.9%

Most occurring scripts

Value    Count     Frequency (%)
Latin    280024    100.0%

Most frequent character per script

Latin
Value    Count    Frequency (%)
d        46096    16.5%
o        46096    16.5%
M        33710    12.0%
é        33710    12.0%
i        33710    12.0%
a        24772     8.8%
G        12386     4.4%
r        12386     4.4%
u        12386     4.4%
ç        12386     4.4%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    221542    79.1%
None      58482    20.9%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
d        46096    20.8%
o        46096    20.8%
M        33710    15.2%
i        33710    15.2%
a        24772    11.2%
G        12386     5.6%
r        12386     5.6%
u        12386     5.6%

None
Value    Count    Frequency (%)
é        33710    57.6%
ç        12386    21.2%
ã        12386    21.2%

Percentual de Progresso
Real number (ℝ)

HIGH CORRELATION  MISSING  ZEROS 

Distinct        3880
Distinct (%)    9.7%
Missing         6063
Missing (%)     13.2%
Infinite        0
Infinite (%)    0.0%
Mean            42.263456
Minimum         0
Maximum         100
Zeros           4441
Zeros (%)       9.6%
Negative        0
Negative (%)    0.0%
Memory size     720.2 KiB

Quantile statistics

Minimum                      0
5-th percentile              0
Q1                           10.92
median                       32.94
Q3                           70.28
95-th percentile             100
Maximum                      100
Range                        100
Interquartile range (IQR)    59.36

Descriptive statistics

Standard deviation                 36.19712
Coefficient of variation (CV)      0.8564638
Kurtosis                           -1.225129
Mean                               42.263456
Median Absolute Deviation (MAD)    29.7
Skewness                           0.46823003
Sum                                1691932.9
Variance                           1310.2315
Monotonicity                       Not monotonic
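
Several of the figures above are linked by simple identities, which makes them easy to cross-check. The sketch below recomputes the derived values for Percentual de Progresso from the reported ones. The numbers are copied from this report and the variable names are illustrative.

```python
# Values copied from the report for "Percentual de Progresso".
n_rows, n_missing = 46096, 6063
mean, std = 42.263456, 36.19712
q1, q3 = 10.92, 70.28
total = 1691932.9  # the reported "Sum"

n_valid = n_rows - n_missing   # rows that actually enter the statistics
cv = std / mean                # coefficient of variation
variance = std ** 2            # variance is the squared standard deviation
iqr = q3 - q1                  # interquartile range
implied_n = total / mean       # Sum / Mean recovers the non-missing count

print(f"non-missing rows: {n_valid}")
print(f"CV:               {cv:.7f}")
print(f"variance:         {variance:.4f}")
print(f"IQR:              {iqr:.2f}")
print(f"Sum / Mean:       {implied_n:.1f}")
```

The last line is a useful sanity check: Sum divided by Mean lands on the non-missing row count, which indicates that the mean is taken over non-missing rows only.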
Histogram with fixed size bins (bins=50)
Value                  Count    Frequency (%)
100                     7429    16.1%
0                       4441     9.6%
44.88                    569     1.2%
22.52                    480     1.0%
44.8                     449     1.0%
22.2                     447     1.0%
0.12                     387     0.8%
69.31                    338     0.7%
0.6                      335     0.7%
22                       260     0.6%
Other values (3870)    24898    54.0%
(Missing)               6063    13.2%

Value    Count    Frequency (%)
0         4441    9.6%
0.07         1    < 0.1%
0.09        18    < 0.1%
0.1         18    < 0.1%
0.11        28    0.1%
0.12       387    0.8%
0.13        43    0.1%
0.14        10    < 0.1%
0.15        80    0.2%
0.16        20    < 0.1%

Value    Count    Frequency (%)
100       7429    16.1%
99.27        1    < 0.1%
99.13        1    < 0.1%
99.11        1    < 0.1%
99.09        1    < 0.1%
99           6    < 0.1%
98.94        2    < 0.1%
98.9         3    < 0.1%
98.88        1    < 0.1%
98.85        1    < 0.1%

Período Atual
Real number (ℝ)

HIGH CORRELATION 

Distinct        10
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            2.4550937
Minimum         1
Maximum         10
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     720.2 KiB

Quantile statistics

Minimum                      1
5-th percentile              1
Q1                           1
median                       2
Q3                           4
95-th percentile             4
Maximum                      10
Range                        9
Interquartile range (IQR)    3

Descriptive statistics

Standard deviation                 1.4372136
Coefficient of variation (CV)      0.5854007
Kurtosis                           0.6706711
Mean                               2.4550937
Median Absolute Deviation (MAD)    1
Skewness                           0.8388074
Sum                                113170
Variance                           2.0655829
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=10)
Value    Count    Frequency (%)
1        16795    36.4%
4        11426    24.8%
2         8956    19.4%
3         7049    15.3%
6          872     1.9%
5          483     1.0%
8          315     0.7%
7          183     0.4%
10          11    < 0.1%
9            6    < 0.1%

Value    Count    Frequency (%)
1        16795    36.4%
2         8956    19.4%
3         7049    15.3%
4        11426    24.8%
5          483     1.0%
6          872     1.9%
7          183     0.4%
8          315     0.7%
9            6    < 0.1%
10          11    < 0.1%

Value    Count    Frequency (%)
10          11    < 0.1%
9            6    < 0.1%
8          315     0.7%
7          183     0.4%
6          872     1.9%
5          483     1.0%
4        11426    24.8%
3         7049    15.3%
2         8956    19.4%
1        16795    36.4%
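
Because Período Atual takes only ten distinct values, its summary statistics can be recovered directly from the frequency table. The sketch below does this with a simple cumulative-count quantile. That rule is an assumption: the profiling tool may interpolate differently, although the results agree with the reported values for this variable. The function name is illustrative.

```python
# Frequency table for "Período Atual", copied from the report (value -> rows).
freq = {1: 16795, 2: 8956, 3: 7049, 4: 11426, 5: 483,
        6: 872, 7: 183, 8: 315, 9: 6, 10: 11}

n = sum(freq.values())
total = sum(value * count for value, count in freq.items())
mean = total / n


def quantile_from_counts(table, q):
    """Smallest value whose cumulative count reaches fraction q of all rows."""
    threshold = q * sum(table.values())
    running = 0
    for value in sorted(table):
        running += table[value]
        if running >= threshold:
            return value
    raise ValueError("empty frequency table")


q1 = quantile_from_counts(freq, 0.25)
median = quantile_from_counts(freq, 0.50)
q3 = quantile_from_counts(freq, 0.75)
p95 = quantile_from_counts(freq, 0.95)

print(n, total, round(mean, 7), q1, median, q3, p95)
```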
Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     2.5 MiB

1    37494
2     8602

Length

Max length       1
Median length    1
Mean length      1
Min length       1

Characters and Unicode

Total characters       46096
Distinct characters    2
Distinct categories    1
Distinct scripts       1
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    1
2nd row    1
3rd row    1
4th row    2
5th row    1

Common Values

Value    Count    Frequency (%)
1        37494    81.3%
2         8602    18.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
1        37494    81.3%
2         8602    18.7%

Most occurring characters

Value    Count    Frequency (%)
1        37494    81.3%
2         8602    18.7%

Most occurring categories

Value             Count    Frequency (%)
Decimal Number    46096    100.0%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
1        37494    81.3%
2         8602    18.7%

Most occurring scripts

Value     Count    Frequency (%)
Common    46096    100.0%

Most frequent character per script

Common
Value    Count    Frequency (%)
1        37494    81.3%
2         8602    18.7%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    46096    100.0%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
1        37494    81.3%
2         8602    18.7%

Renda Per Capita
Real number (ℝ)

MISSING  ZEROS 

Distinct        533
Distinct (%)    1.2%
Missing         2228
Missing (%)     4.8%
Infinite        0
Infinite (%)    0.0%
Mean            0.67357641
Minimum         0
Maximum         10
Zeros           2552
Zeros (%)       5.5%
Negative        0
Negative (%)    0.0%
Memory size     720.2 KiB

Quantile statistics

Minimum                      0
5-th percentile              0
Q1                           0.22
median                       0.39
Q3                           0.72
95-th percentile             2.05
Maximum                      10
Range                        10
Interquartile range (IQR)    0.5

Descriptive statistics

Standard deviation                 1.1060129
Coefficient of variation (CV)      1.6420007
Kurtosis                           41.432219
Mean                               0.67357641
Median Absolute Deviation (MAD)    0.21
Skewness                           5.7666762
Sum                                29548.45
Variance                           1.2232646
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=50)
Value                 Count    Frequency (%)
0                      2552     5.5%
0.25                   1840     4.0%
0.33                   1685     3.7%
0.5                    1504     3.3%
0.17                   1162     2.5%
0.36                    939     2.0%
0.41                    798     1.7%
0.21                    788     1.7%
0.2                     779     1.7%
0.27                    773     1.7%
Other values (523)    31048    67.4%
(Missing)              2228     4.8%

Value    Count    Frequency (%)
0         2552    5.5%
0.01        45    0.1%
0.02        76    0.2%
0.03       137    0.3%
0.04       155    0.3%
0.05       187    0.4%
0.06       264    0.6%
0.07       303    0.7%
0.08       534    1.2%
0.09       269    0.6%

Value    Count    Frequency (%)
10         329    0.7%
9.9          2    < 0.1%
9.87         1    < 0.1%
9.78         2    < 0.1%
9.38         1    < 0.1%
9.28         1    < 0.1%
9.27         1    < 0.1%
9.12         2    < 0.1%
9.09         4    < 0.1%
9.08         1    < 0.1%

Sexo
Categorical

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     2.5 MiB

M    24062
F    22034

Length

Max length       1
Median length    1
Mean length      1
Min length       1

Characters and Unicode

Total characters       46096
Distinct characters    2
Distinct categories    1
Distinct scripts       1
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    M
2nd row    M
3rd row    M
4th row    M
5th row    F

Common Values

Value    Count    Frequency (%)
M        24062    52.2%
F        22034    47.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
m        24062    52.2%
f        22034    47.8%

Most occurring characters

Value    Count    Frequency (%)
M        24062    52.2%
F        22034    47.8%

Most occurring categories

Value               Count    Frequency (%)
Uppercase Letter    46096    100.0%

Most frequent character per category

Uppercase Letter
Value    Count    Frequency (%)
M        24062    52.2%
F        22034    47.8%

Most occurring scripts

Value    Count    Frequency (%)
Latin    46096    100.0%

Most frequent character per script

Latin
Value    Count    Frequency (%)
M        24062    52.2%
F        22034    47.8%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    46096    100.0%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
M        24062    52.2%
F        22034    47.8%
Distinct        2
Distinct (%)    < 0.1%
Missing         7
Missing (%)     < 0.1%
Memory size     3.7 MiB

Pública    35173
Privada    10916

Length

Max length       7
Median length    7
Mean length      7
Min length       7

Characters and Unicode

Total characters       322623
Distinct characters    10
Distinct categories    2
Distinct scripts       1
Distinct blocks        2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Pública
2nd row    Privada
3rd row    Pública
4th row    Pública
5th row    Pública

Common Values

Value        Count    Frequency (%)
Pública      35173    76.3%
Privada      10916    23.7%
(Missing)        7    < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value      Count    Frequency (%)
pública    35173    76.3%
privada    10916    23.7%

Most occurring characters

Value    Count    Frequency (%)
a        57005    17.7%
P        46089    14.3%
i        46089    14.3%
ú        35173    10.9%
b        35173    10.9%
l        35173    10.9%
c        35173    10.9%
r        10916     3.4%
v        10916     3.4%
d        10916     3.4%

Most occurring categories

Value               Count     Frequency (%)
Lowercase Letter    276534    85.7%
Uppercase Letter     46089    14.3%

Most frequent character per category

Lowercase Letter
Value    Count    Frequency (%)
a        57005    20.6%
i        46089    16.7%
ú        35173    12.7%
b        35173    12.7%
l        35173    12.7%
c        35173    12.7%
r        10916     3.9%
v        10916     3.9%
d        10916     3.9%

Uppercase Letter
Value    Count    Frequency (%)
P        46089    100.0%

Most occurring scripts

Value    Count     Frequency (%)
Latin    322623    100.0%

Most frequent character per script

Latin
Value    Count    Frequency (%)
a        57005    17.7%
P        46089    14.3%
i        46089    14.3%
ú        35173    10.9%
b        35173    10.9%
l        35173    10.9%
c        35173    10.9%
r        10916     3.4%
v        10916     3.4%
d        10916     3.4%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    287450    89.1%
None      35173    10.9%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
a        57005    19.8%
P        46089    16.0%
i        46089    16.0%
b        35173    12.2%
l        35173    12.2%
c        35173    12.2%
r        10916     3.8%
v        10916     3.8%
d        10916     3.8%

None
Value    Count    Frequency (%)
ú        35173    100.0%

Turno
Categorical

Distinct        6
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     2.9 MiB

Vespertino    17784
Matutino      14571
Noturno       10618
EAD            2012
Integral        991

Length

Max length       10
Median length    8
Mean length      8.317815
Min length       3

Characters and Unicode

Total characters       383418
Distinct characters    19
Distinct categories    2
Distinct scripts       1
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Matutino
2nd row    Vespertino
3rd row    Noturno
4th row    Vespertino
5th row    Vespertino

Common Values

Value         Count    Frequency (%)
Vespertino    17784    38.6%
Matutino      14571    31.6%
Noturno       10618    23.0%
EAD            2012     4.4%
Integral        991     2.1%
Diurno          120     0.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value         Count    Frequency (%)
vespertino    17784    38.6%
matutino      14571    31.6%
noturno       10618    23.0%
ead            2012     4.4%
integral        991     2.1%
diurno          120     0.3%

Most occurring characters

Value               Count    Frequency (%)
t                   58535    15.3%
o                   53711    14.0%
n                   44084    11.5%
e                   36559     9.5%
i                   32475     8.5%
r                   29513     7.7%
u                   25309     6.6%
V                   17784     4.6%
s                   17784     4.6%
p                   17784     4.6%
Other values (9)    49880    13.0%

Most occurring categories

Value               Count     Frequency (%)
Lowercase Letter    333298    86.9%
Uppercase Letter     50120    13.1%

Most frequent character per category

Lowercase Letter
Value               Count    Frequency (%)
t                   58535    17.6%
o                   53711    16.1%
n                   44084    13.2%
e                   36559    11.0%
i                   32475     9.7%
r                   29513     8.9%
u                   25309     7.6%
s                   17784     5.3%
p                   17784     5.3%
a                   15562     4.7%
Other values (2)     1982     0.6%

Uppercase Letter
Value    Count    Frequency (%)
V        17784    35.5%
M        14571    29.1%
N        10618    21.2%
D         2132     4.3%
E         2012     4.0%
A         2012     4.0%
I          991     2.0%

Most occurring scripts

Value    Count     Frequency (%)
Latin    383418    100.0%

Most frequent character per script

Latin
Value               Count    Frequency (%)
t                   58535    15.3%
o                   53711    14.0%
n                   44084    11.5%
e                   36559     9.5%
i                   32475     8.5%
r                   29513     7.7%
u                   25309     6.6%
V                   17784     4.6%
s                   17784     4.6%
p                   17784     4.6%
Other values (9)    49880    13.0%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    383418    100.0%

Most frequent character per block

ASCII
Value               Count    Frequency (%)
t                   58535    15.3%
o                   53711    14.0%
n                   44084    11.5%
e                   36559     9.5%
i                   32475     8.5%
r                   29513     7.7%
u                   25309     6.6%
V                   17784     4.6%
s                   17784     4.6%
p                   17784     4.6%
Other values (9)    49880    13.0%

Zona Residencial
Categorical

Distinct        2
Distinct (%)    < 0.1%
Missing         8
Missing (%)     < 0.1%
Memory size     2.8 MiB

Urbana    39469
Rural      6619

Length

Max length       6
Median length    6
Mean length      5.8563834
Min length       5

Characters and Unicode

Total characters       269909
Distinct characters    8
Distinct categories    2
Distinct scripts       1
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Urbana
2nd row    Urbana
3rd row    Urbana
4th row    Urbana
5th row    Urbana

Common Values

Value        Count    Frequency (%)
Urbana       39469    85.6%
Rural         6619    14.4%
(Missing)        8    < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value     Count    Frequency (%)
urbana    39469    85.6%
rural      6619    14.4%

Most occurring characters

Value    Count    Frequency (%)
a        85557    31.7%
r        46088    17.1%
U        39469    14.6%
b        39469    14.6%
n        39469    14.6%
R         6619     2.5%
u         6619     2.5%
l         6619     2.5%

Most occurring categories

Value               Count     Frequency (%)
Lowercase Letter    223821    82.9%
Uppercase Letter     46088    17.1%

Most frequent character per category

Lowercase Letter
Value    Count    Frequency (%)
a        85557    38.2%
r        46088    20.6%
b        39469    17.6%
n        39469    17.6%
u         6619     3.0%
l         6619     3.0%

Uppercase Letter
Value    Count    Frequency (%)
U        39469    85.6%
R         6619    14.4%

Most occurring scripts

Value    Count     Frequency (%)
Latin    269909    100.0%

Most frequent character per script

Latin
Value    Count    Frequency (%)
a        85557    31.7%
r        46088    17.1%
U        39469    14.6%
b        39469    14.6%
n        39469    14.6%
R         6619     2.5%
u         6619     2.5%
l         6619     2.5%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    269909    100.0%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
a        85557    31.7%
r        46088    17.1%
U        39469    14.6%
b        39469    14.6%
n        39469    14.6%
R         6619     2.5%
u         6619     2.5%
l         6619     2.5%

Prática Profissional Pendente
Categorical

HIGH CORRELATION  MISSING 

Distinct        2
Distinct (%)    < 0.1%
Missing         6967
Missing (%)     15.1%
Memory size     2.8 MiB

Sim    30958
Não     8171

Length

Max length       3
Median length    3
Mean length      3
Min length       3

Characters and Unicode

Total characters       117387
Distinct characters    6
Distinct categories    2
Distinct scripts       1
Distinct blocks        2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Não
2nd row    Sim
3rd row    Sim
4th row    Não
5th row    Não

Common Values

Value        Count    Frequency (%)
Sim          30958    67.2%
Não           8171    17.7%
(Missing)     6967    15.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
sim      30958    79.1%
não       8171    20.9%
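
The two tables above give different percentages for the same counts because they use different denominators: the first divides by all rows, including missing ones, while the lowercase table divides by non-missing rows only. The sketch below reproduces both sets of figures from the reported counts. The variable names are illustrative.

```python
# Counts copied from the report for "Prática Profissional Pendente".
n_rows = 46096
counts = {"Sim": 30958, "Não": 8171}
n_missing = 6967

n_valid = sum(counts.values())
assert n_valid + n_missing == n_rows  # the counts account for every row

# Share of all rows (missing values count as their own bucket).
share_of_all = {k: round(100 * c / n_rows, 1) for k, c in counts.items()}
share_of_all["(Missing)"] = round(100 * n_missing / n_rows, 1)

# Share of non-missing rows only.
share_of_valid = {k: round(100 * c / n_valid, 1) for k, c in counts.items()}

print(share_of_all)
print(share_of_valid)
```

When comparing this variable with others, it matters which of the two bases a quoted percentage refers to, since roughly 15% of rows are missing here.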

Most occurring characters

Value    Count    Frequency (%)
S        30958    26.4%
i        30958    26.4%
m        30958    26.4%
N         8171     7.0%
ã         8171     7.0%
o         8171     7.0%

Most occurring categories

Value               Count    Frequency (%)
Lowercase Letter    78258    66.7%
Uppercase Letter    39129    33.3%

Most frequent character per category

Lowercase Letter
Value    Count    Frequency (%)
i        30958    39.6%
m        30958    39.6%
ã         8171    10.4%
o         8171    10.4%

Uppercase Letter
Value    Count    Frequency (%)
S        30958    79.1%
N         8171    20.9%

Most occurring scripts

Value    Count     Frequency (%)
Latin    117387    100.0%

Most frequent character per script

Latin
Value    Count    Frequency (%)
S        30958    26.4%
i        30958    26.4%
m        30958    26.4%
N         8171     7.0%
ã         8171     7.0%
o         8171     7.0%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    109216    93.0%
None       8171     7.0%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
S        30958    28.3%
i        30958    28.3%
m        30958    28.3%
N         8171     7.5%
o         8171     7.5%

None
Value    Count    Frequency (%)
ã         8171    100.0%

Carga Horária de Prática Profissional Pendente
Categorical

HIGH CORRELATION  MISSING 

Distinct        2
Distinct (%)    < 0.1%
Missing         6967
Missing (%)     15.1%
Memory size     2.8 MiB

Sim    31219
Não     7910

Length

Max length       3
Median length    3
Mean length      3
Min length       3

Characters and Unicode

Total characters       117387
Distinct characters    6
Distinct categories    2
Distinct scripts       1
Distinct blocks        2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Não
2nd row    Sim
3rd row    Sim
4th row    Não
5th row    Não

Common Values

Value        Count    Frequency (%)
Sim          31219    67.7%
Não           7910    17.2%
(Missing)     6967    15.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
sim      31219    79.8%
não       7910    20.2%

Most occurring characters

Value    Count    Frequency (%)
S        31219    26.6%
i        31219    26.6%
m        31219    26.6%
N         7910     6.7%
ã         7910     6.7%
o         7910     6.7%

Most occurring categories

Value               Count    Frequency (%)
Lowercase Letter    78258    66.7%
Uppercase Letter    39129    33.3%

Most frequent character per category

Lowercase Letter
Value    Count    Frequency (%)
i        31219    39.9%
m        31219    39.9%
ã         7910    10.1%
o         7910    10.1%

Uppercase Letter
Value    Count    Frequency (%)
S        31219    79.8%
N         7910    20.2%

Most occurring scripts

Value    Count     Frequency (%)
Latin    117387    100.0%

Most frequent character per script

Latin
Value    Count    Frequency (%)
S        31219    26.6%
i        31219    26.6%
m        31219    26.6%
N         7910     6.7%
ã         7910     6.7%
o         7910     6.7%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    109477    93.3%
None       7910     6.7%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
S        31219    28.5%
i        31219    28.5%
m        31219    28.5%
N         7910     7.2%
o         7910     7.2%

None
Value    Count    Frequency (%)
ã         7910    100.0%

Registro de TCC Pendente
Categorical

HIGH CORRELATION  MISSING 

Distinct        2
Distinct (%)    < 0.1%
Missing         6969
Missing (%)     15.1%
Memory size     3.3 MiB

Não    29272
Sim     9855

Length

Max length       3
Median length    3
Mean length      3
Min length       3

Characters and Unicode

Total characters       117381
Distinct characters    6
Distinct categories    2
Distinct scripts       1
Distinct blocks        2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Não
2nd row    Não
3rd row    Não
4th row    Não
5th row    Não

Common Values

Value        Count    Frequency (%)
Não          29272    63.5%
Sim           9855    21.4%
(Missing)     6969    15.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
não      29272    74.8%
sim       9855    25.2%

Most occurring characters

Value    Count    Frequency (%)
N        29272    24.9%
ã        29272    24.9%
o        29272    24.9%
S         9855     8.4%
i         9855     8.4%
m         9855     8.4%

Most occurring categories

Value               Count    Frequency (%)
Lowercase Letter    78254    66.7%
Uppercase Letter    39127    33.3%

Most frequent character per category

Lowercase Letter
Value    Count    Frequency (%)
ã        29272    37.4%
o        29272    37.4%
i         9855    12.6%
m         9855    12.6%

Uppercase Letter
Value    Count    Frequency (%)
N        29272    74.8%
S         9855    25.2%

Most occurring scripts

Value    Count     Frequency (%)
Latin    117381    100.0%

Most frequent character per script

Latin
Value    Count    Frequency (%)
N        29272    24.9%
ã        29272    24.9%
o        29272    24.9%
S         9855     8.4%
i         9855     8.4%
m         9855     8.4%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    88109    75.1%
None     29272    24.9%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
N        29272    33.2%
o        29272    33.2%
S         9855    11.2%
i         9855    11.2%
m         9855    11.2%

None
Value    Count    Frequency (%)
ã        29272    100.0%

Carga Horária de Seminário Pendente
Categorical

HIGH CORRELATION  MISSING 

Distinct        2
Distinct (%)    < 0.1%
Missing         7236
Missing (%)     15.7%
Memory size     2.9 MiB

Sim    28295
Não    10565

Length

Max length       3
Median length    3
Mean length      3
Min length       3

Characters and Unicode

Total characters       116580
Distinct characters    6
Distinct categories    2
Distinct scripts       1
Distinct blocks        2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Não
2nd row    Sim
3rd row    Sim
4th row    Não
5th row    Não

Common Values

Value        Count    Frequency (%)
Sim          28295    61.4%
Não          10565    22.9%
(Missing)     7236    15.7%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
sim      28295    72.8%
não      10565    27.2%

Most occurring characters

Value    Count    Frequency (%)
S        28295    24.3%
i        28295    24.3%
m        28295    24.3%
N        10565     9.1%
ã        10565     9.1%
o        10565     9.1%

Most occurring categories

Value               Count    Frequency (%)
Lowercase Letter    77720    66.7%
Uppercase Letter    38860    33.3%

Most frequent character per category

Lowercase Letter
Value    Count    Frequency (%)
i        28295    36.4%
m        28295    36.4%
ã        10565    13.6%
o        10565    13.6%

Uppercase Letter
Value    Count    Frequency (%)
S        28295    72.8%
N        10565    27.2%

Most occurring scripts

Value    Count     Frequency (%)
Latin    116580    100.0%

Most frequent character per script

Latin
Value    Count    Frequency (%)
S        28295    24.3%
i        28295    24.3%
m        28295    24.3%
N        10565     9.1%
ã        10565     9.1%
o        10565     9.1%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    106015    90.9%
None      10565     9.1%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
S        28295    26.7%
i        28295    26.7%
m        28295    26.7%
N        10565    10.0%
o        10565    10.0%

None
Value    Count    Frequency (%)
ã        10565    100.0%

Carga Horária Obrigatória Pendente
Categorical

HIGH CORRELATION  MISSING 

Distinct        2
Distinct (%)    < 0.1%
Missing         6967
Missing (%)     15.1%
Memory size     2.8 MiB

Sim    30546
Não     8583

Length

Max length       3
Median length    3
Mean length      3
Min length       3

Characters and Unicode

Total characters       117387
Distinct characters    6
Distinct categories    2
Distinct scripts       1
Distinct blocks        2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Não
2nd row    Sim
3rd row    Sim
4th row    Não
5th row    Não

Common Values

Value        Count    Frequency (%)
Sim          30546    66.3%
Não           8583    18.6%
(Missing)     6967    15.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
sim      30546    78.1%
não       8583    21.9%

Most occurring characters

Value    Count    Frequency (%)
S        30546    26.0%
i        30546    26.0%
m        30546    26.0%
N         8583     7.3%
ã         8583     7.3%
o         8583     7.3%

Most occurring categories

Value               Count    Frequency (%)
Lowercase Letter    78258    66.7%
Uppercase Letter    39129    33.3%

Most frequent character per category

Lowercase Letter
Value    Count    Frequency (%)
i        30546    39.0%
m        30546    39.0%
ã         8583    11.0%
o         8583    11.0%

Uppercase Letter
Value    Count    Frequency (%)
S        30546    78.1%
N         8583    21.9%

Most occurring scripts

Value    Count     Frequency (%)
Latin    117387    100.0%

Most frequent character per script

Latin
Value    Count    Frequency (%)
S        30546    26.0%
i        30546    26.0%
m        30546    26.0%
N         8583     7.3%
ã         8583     7.3%
o         8583     7.3%

Most occurring blocks

Value    Count     Frequency (%)
ASCII    108804    92.7%
None       8583     7.3%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
S        30546    28.1%
i        30546    28.1%
m        30546    28.1%
N         8583     7.9%
o         8583     7.9%

None
Value    Count    Frequency (%)
ã         8583    100.0%

curso
Text

Distinct        78
Distinct (%)    0.2%
Missing         0
Missing (%)     0.0%
Memory size     5.5 MiB

Length

Max length       90
Median length    47
Mean length      26.894806
Min length       16

Characters and Unicode

Total characters       1239743
Distinct characters    56
Distinct categories    6
Distinct scripts       3
Distinct blocks        3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    Técnico em Agropecuária
2nd row    Técnico em Informática
3rd row    Técnico em Manutenção e Suporte em Informática
4th row    Técnico em Química
5th row    Técnico em Biocombustíveis

Value                 Count    Frequency (%)
em                    46826    26.8%
técnico               33710    19.3%
informática            9065     5.2%
tecnologia             6781     3.9%
licenciatura           4616     2.6%
e                      4097     2.3%
de                     4051     2.3%
edificações            3315     1.9%
para                   2948     1.7%
internet               2679     1.5%
Other values (100)    56730    32.5%

Most occurring characters

Value                Count     Frequency (%)
(space)              128722    10.4%
e                    119128     9.6%
c                    116795     9.4%
i                    106152     8.6%
o                     96951     7.8%
n                     87291     7.0%
a                     77644     6.3%
m                     74204     6.0%
t                     50585     4.1%
r                     46192     3.7%
Other values (46)    336079    27.1%

Most occurring categories

Value                Count     Frequency (%)
Lowercase Letter     996307    80.4%
Space Separator      128722    10.4%
Uppercase Letter     114321     9.2%
Dash Punctuation        203    < 0.1%
Nonspacing Mark          95    < 0.1%
Other Punctuation        95    < 0.1%

Most frequent character per category

Lowercase Letter
Value                Count     Frequency (%)
e                    119128    12.0%
c                    116795    11.7%
i                    106152    10.7%
o                     96951     9.7%
n                     87291     8.8%
a                     77644     7.8%
m                     74204     7.4%
t                     50585     5.1%
r                     46192     4.6%
é                     37060     3.7%
Other values (24)    184305    18.5%

Uppercase Letter
Value               Count    Frequency (%)
T                   42266    37.0%
I                   12248    10.7%
E                   10418     9.1%
M                    8835     7.7%
A                    8362     7.3%
L                    6136     5.4%
S                    5417     4.7%
G                    4023     3.5%
C                    3771     3.3%
P                    3170     2.8%
Other values (8)     9675     8.5%

Space Separator
Value      Count     Frequency (%)
(space)    128722    100.0%

Dash Punctuation
Value    Count    Frequency (%)
-          203    100.0%

Nonspacing Mark
Value                            Count    Frequency (%)
U+0302 (combining circumflex)       95    100.0%

Other Punctuation
Value    Count    Frequency (%)
/           95    100.0%
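
The 95 Nonspacing Mark characters suggest that a small number of course names store an accent as a separate combining character (decomposed form) instead of a single precomposed letter. If that is the cause, normalising to NFC before grouping or comparing names would merge the two spellings. The sketch below shows the effect with Python's standard `unicodedata` module; the example string is hypothetical and not taken from the dataset.

```python
import unicodedata

# Hypothetical value: "o" followed by U+0302 COMBINING CIRCUMFLEX ACCENT.
decomposed = "Eletro\u0302nica"
composed = unicodedata.normalize("NFC", decomposed)

# The combining mark is what the report counts as "Nonspacing Mark" (Mn).
mark_category = unicodedata.category("\u0302")

# Both spellings render the same way but compare unequal until normalised.
precomposed = "Eletr\u00f4nica"
equal_before = decomposed == precomposed
equal_after = composed == precomposed

print(len(decomposed), len(composed), mark_category, equal_before, equal_after)
```

Whether the affected rows should be normalised in the source data is a separate decision; the sketch only shows why the two encodings are counted as different characters.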

Most occurring scripts

Value        Count      Frequency (%)
Latin        1110628    89.6%
Common        129020    10.4%
Inherited         95    < 0.1%

Most frequent character per script

Latin
Value                Count     Frequency (%)
e                    119128    10.7%
c                    116795    10.5%
i                    106152     9.6%
o                     96951     8.7%
n                     87291     7.9%
a                     77644     7.0%
m                     74204     6.7%
t                     50585     4.6%
r                     46192     4.2%
T                     42266     3.8%
Other values (42)    293420    26.4%

Common
Value      Count     Frequency (%)
(space)    128722    99.8%
-             203     0.2%
/              95     0.1%

Inherited
Value                            Count    Frequency (%)
U+0302 (combining circumflex)       95    100.0%

Most occurring blocks

Value           Count      Frequency (%)
ASCII           1156294    93.3%
None              83354     6.7%
Diacriticals         95    < 0.1%

Most frequent character per block

ASCII
Value                Count     Frequency (%)
(space)              128722    11.1%
e                    119128    10.3%
c                    116795    10.1%
i                    106152     9.2%
o                     96951     8.4%
n                     87291     7.5%
a                     77644     6.7%
m                     74204     6.4%
t                     50585     4.4%
r                     46192     4.0%
Other values (34)    252630    21.8%

None
Value    Count    Frequency (%)
é        37060    44.5%
á        13214    15.9%
ç        11333    13.6%
ã         9192    11.0%
í         4128     5.0%
õ         3315     4.0%
â         1984     2.4%
ô         1318     1.6%
ó          896     1.1%
ú          670     0.8%

Diacriticals
Value                            Count    Frequency (%)
U+0302 (combining circumflex)       95    100.0%

class
Categorical

HIGH CORRELATION 

Distinct        2
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     2.5 MiB

0    31064
1    15032

Length

Max length       1
Median length    1
Mean length      1
Min length       1

Characters and Unicode

Total characters       46096
Distinct characters    2
Distinct categories    1
Distinct scripts       1
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row    0
2nd row    0
3rd row    1
4th row    1
5th row    0

Common Values

Value    Count    Frequency (%)
0        31064    67.4%
1        15032    32.6%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count    Frequency (%)
0        31064    67.4%
1        15032    32.6%

Most occurring characters

Value    Count    Frequency (%)
0        31064    67.4%
1        15032    32.6%

Most occurring categories

Value             Count    Frequency (%)
Decimal Number    46096    100.0%

Most frequent character per category

Decimal Number
Value    Count    Frequency (%)
0        31064    67.4%
1        15032    32.6%

Most occurring scripts

Value     Count    Frequency (%)
Common    46096    100.0%

Most frequent character per script

Common
Value    Count    Frequency (%)
0        31064    67.4%
1        15032    32.6%

Most occurring blocks

Value    Count    Frequency (%)
ASCII    46096    100.0%

Most frequent character per block

ASCII
Value    Count    Frequency (%)
0        31064    67.4%
1        15032    32.6%

Interactions

[Figures: 64 interaction plots (Matplotlib SVG), not reproduced in this text export.]

Correlations

Variable | Ano Letivo de Previsão de Conclusão | Ano de Ingresso | Campus | Carga Horária Obrigatória Pendente | Carga Horária de Prática Profissional Pendente | Carga Horária de Seminário Pendente | Código Curso | Estado Civil | Etnia/Raça | Forma de Ingresso | Frequência no Período | I.R.A. | Modalidade | Nível de Ensino | Percentual de Progresso | Período Atual | Período de Ingresso | Prática Profissional Pendente | Registro de TCC Pendente | Renda Per Capita | Sexo | Tipo de Escola de Origem | Turno | Zona Residencial | class
Ano Letivo de Previsão de Conclusão | 1.000 | 0.893 | 0.083 | 0.532 | 0.496 | 0.460 | 0.052 | 0.046 | 0.045 | 0.098 | 0.181 | -0.235 | 0.321 | 0.200 | -0.339 | -0.412 | 0.216 | 0.485 | 0.323 | 0.027 | 0.040 | 0.059 | 0.246 | 0.073 | 0.334
Ano de Ingresso | 0.893 | 1.000 | 0.072 | 0.512 | 0.498 | 0.495 | 0.032 | 0.043 | 0.051 | 0.055 | 0.134 | -0.294 | 0.063 | 0.054 | -0.375 | -0.471 | 0.141 | 0.493 | 0.241 | 0.026 | 0.047 | 0.039 | 0.088 | 0.070 | 0.307
Campus | 0.083 | 0.072 | 1.000 | 0.089 | 0.104 | 0.148 | 0.948 | 0.102 | 0.093 | 0.073 | 0.077 | 0.099 | 0.274 | 0.269 | 0.095 | 0.081 | 0.289 | 0.106 | 0.254 | 0.053 | 0.130 | 0.152 | 0.473 | 0.269 | 0.137
Carga Horária Obrigatória Pendente | 0.532 | 0.512 | 0.089 | 1.000 | 0.893 | 0.830 | 0.075 | 0.049 | 0.049 | 0.111 | 0.364 | 0.392 | 0.198 | 0.182 | 0.942 | 0.672 | 0.026 | 0.878 | 0.262 | 0.013 | 0.055 | 0.065 | 0.115 | 0.034 | 0.339
Carga Horária de Prática Profissional Pendente | 0.496 | 0.498 | 0.104 | 0.893 | 1.000 | 0.792 | 0.088 | 0.055 | 0.050 | 0.110 | 0.378 | 0.386 | 0.209 | 0.181 | 0.972 | 0.636 | 0.051 | 0.964 | 0.264 | 0.015 | 0.057 | 0.069 | 0.128 | 0.034 | 0.347
Carga Horária de Seminário Pendente | 0.460 | 0.495 | 0.148 | 0.830 | 0.792 | 1.000 | 0.117 | 0.078 | 0.055 | 0.161 | 0.367 | 0.414 | 0.288 | 0.260 | 0.894 | 0.773 | 0.071 | 0.790 | 0.321 | 0.014 | 0.065 | 0.075 | 0.175 | 0.025 | 0.358
Código Curso | 0.052 | 0.032 | 0.948 | 0.075 | 0.088 | 0.117 | 1.000 | 0.086 | 0.065 | 0.084 | 0.048 | -0.011 | 0.228 | 0.197 | 0.025 | -0.005 | 0.228 | 0.087 | 0.177 | -0.107 | 0.115 | 0.107 | 0.329 | 0.207 | 0.116
Estado Civil | 0.046 | 0.043 | 0.102 | 0.049 | 0.055 | 0.078 | 0.086 | 1.000 | 0.011 | 0.045 | 0.062 | 0.043 | 0.127 | 0.119 | 0.082 | 0.046 | 0.159 | 0.055 | 0.108 | 0.041 | 0.065 | 0.038 | 0.130 | 0.058 | 0.102
Etnia/Raça | 0.045 | 0.051 | 0.093 | 0.049 | 0.050 | 0.055 | 0.065 | 0.011 | 1.000 | 0.222 | 0.032 | 0.055 | 0.050 | 0.069 | 0.041 | 0.026 | 0.041 | 0.049 | 0.062 | 0.065 | 0.029 | 0.238 | 0.032 | 0.106 | 0.046
Forma de Ingresso | 0.098 | 0.055 | 0.073 | 0.111 | 0.110 | 0.161 | 0.084 | 0.045 | 0.222 | 1.000 | 0.055 | 0.074 | 0.273 | 0.559 | 0.075 | 0.094 | 0.125 | 0.108 | 0.528 | 0.069 | 0.070 | 0.478 | 0.156 | 0.140 | 0.126
Frequência no Período | 0.181 | 0.134 | 0.077 | 0.364 | 0.378 | 0.367 | 0.048 | 0.062 | 0.032 | 0.055 | 1.000 | 0.177 | 0.213 | 0.223 | 0.385 | 0.178 | 0.208 | 0.379 | 0.250 | 0.043 | 0.102 | 0.101 | 0.131 | 0.034 | 0.715
I.R.A. | -0.235 | -0.294 | 0.099 | 0.392 | 0.386 | 0.414 | -0.011 | 0.043 | 0.055 | 0.074 | 0.177 | 1.000 | 0.124 | 0.155 | 0.677 | 0.738 | 0.111 | 0.388 | 0.261 | 0.078 | 0.106 | 0.154 | 0.103 | 0.059 | 0.416
Modalidade | 0.321 | 0.063 | 0.274 | 0.198 | 0.209 | 0.288 | 0.228 | 0.127 | 0.050 | 0.273 | 0.213 | 0.124 | 1.000 | 1.000 | 0.240 | 0.264 | 0.492 | 0.207 | 0.948 | 0.041 | 0.108 | 0.190 | 0.495 | 0.074 | 0.432
Nível de Ensino | 0.200 | 0.054 | 0.269 | 0.182 | 0.181 | 0.260 | 0.197 | 0.119 | 0.069 | 0.559 | 0.223 | 0.155 | 1.000 | 1.000 | 0.340 | 0.436 | 0.010 | 0.178 | 0.947 | 0.035 | 0.064 | 0.055 | 0.341 | 0.023 | 0.170
Percentual de Progresso | -0.339 | -0.375 | 0.095 | 0.942 | 0.972 | 0.894 | 0.025 | 0.082 | 0.041 | 0.075 | 0.385 | 0.677 | 0.240 | 0.340 | 1.000 | 0.919 | 0.217 | 0.945 | 0.397 | 0.028 | 0.122 | 0.143 | 0.155 | 0.067 | 0.600
Período Atual | -0.412 | -0.471 | 0.081 | 0.672 | 0.636 | 0.773 | -0.005 | 0.046 | 0.026 | 0.094 | 0.178 | 0.738 | 0.264 | 0.436 | 0.919 | 1.000 | 0.100 | 0.644 | 0.466 | 0.019 | 0.077 | 0.092 | 0.184 | 0.048 | 0.409
Período de Ingresso | 0.216 | 0.141 | 0.289 | 0.026 | 0.051 | 0.071 | 0.228 | 0.159 | 0.041 | 0.125 | 0.208 | 0.111 | 0.492 | 0.010 | 0.217 | 0.100 | 1.000 | 0.052 | 0.026 | 0.010 | 0.051 | 0.065 | 0.372 | 0.052 | 0.195
Prática Profissional Pendente | 0.485 | 0.493 | 0.106 | 0.878 | 0.964 | 0.790 | 0.087 | 0.055 | 0.049 | 0.108 | 0.379 | 0.388 | 0.207 | 0.178 | 0.945 | 0.644 | 0.052 | 1.000 | 0.260 | 0.017 | 0.059 | 0.069 | 0.126 | 0.035 | 0.350
Registro de TCC Pendente | 0.323 | 0.241 | 0.254 | 0.262 | 0.264 | 0.321 | 0.177 | 0.108 | 0.062 | 0.528 | 0.250 | 0.261 | 0.948 | 0.947 | 0.397 | 0.466 | 0.026 | 0.260 | 1.000 | 0.032 | 0.064 | 0.060 | 0.338 | 0.011 | 0.183
Renda Per Capita | 0.027 | 0.026 | 0.053 | 0.013 | 0.015 | 0.014 | -0.107 | 0.041 | 0.065 | 0.069 | 0.043 | 0.078 | 0.041 | 0.035 | 0.028 | 0.019 | 0.010 | 0.017 | 0.032 | 1.000 | 0.054 | 0.289 | 0.042 | 0.119 | 0.042
Sexo | 0.040 | 0.047 | 0.130 | 0.055 | 0.057 | 0.065 | 0.115 | 0.065 | 0.029 | 0.070 | 0.102 | 0.106 | 0.108 | 0.064 | 0.122 | 0.077 | 0.051 | 0.059 | 0.064 | 0.054 | 1.000 | 0.021 | 0.093 | 0.056 | 0.082
Tipo de Escola de Origem | 0.059 | 0.039 | 0.152 | 0.065 | 0.069 | 0.075 | 0.107 | 0.038 | 0.238 | 0.478 | 0.101 | 0.154 | 0.190 | 0.055 | 0.143 | 0.092 | 0.065 | 0.069 | 0.060 | 0.289 | 0.021 | 1.000 | 0.120 | 0.171 | 0.080
Turno | 0.246 | 0.088 | 0.473 | 0.115 | 0.128 | 0.175 | 0.329 | 0.130 | 0.032 | 0.156 | 0.131 | 0.103 | 0.495 | 0.341 | 0.155 | 0.184 | 0.372 | 0.126 | 0.338 | 0.042 | 0.093 | 0.120 | 1.000 | 0.053 | 0.247
Zona Residencial | 0.073 | 0.070 | 0.269 | 0.034 | 0.034 | 0.025 | 0.207 | 0.058 | 0.106 | 0.140 | 0.034 | 0.059 | 0.074 | 0.023 | 0.067 | 0.048 | 0.052 | 0.035 | 0.011 | 0.119 | 0.056 | 0.171 | 0.053 | 1.000 | 0.038
class | 0.334 | 0.307 | 0.137 | 0.339 | 0.347 | 0.358 | 0.116 | 0.102 | 0.046 | 0.126 | 0.715 | 0.416 | 0.432 | 0.170 | 0.600 | 0.409 | 0.195 | 0.350 | 0.183 | 0.042 | 0.082 | 0.080 | 0.247 | 0.038 | 1.000
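
A matrix like the one above is easier to read once the strongest off-diagonal pairs are pulled out. The sketch below scans the upper triangle of a correlation matrix and lists pairs whose absolute value meets a cutoff. The function name `strong_pairs` and the 0.5 threshold are assumptions chosen for illustration, not the report's own setting; the four-variable sub-matrix copies values from the table.

```python
def strong_pairs(names, matrix, threshold):
    """List (name_a, name_b, value) for off-diagonal entries with |value| >= threshold."""
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):  # upper triangle only, skips the diagonal
            if abs(matrix[i][j]) >= threshold:
                pairs.append((names[i], names[j], matrix[i][j]))
    # Strongest relationships first.
    return sorted(pairs, key=lambda p: -abs(p[2]))


# Sub-matrix copied from the correlation table above.
names = ["Campus", "Código Curso", "Frequência no Período", "class"]
matrix = [
    [1.000, 0.948, 0.077, 0.137],
    [0.948, 1.000, 0.048, 0.116],
    [0.077, 0.048, 1.000, 0.715],
    [0.137, 0.116, 0.715, 1.000],
]

# Assumption: 0.5 is a hypothetical cutoff, not taken from the report.
pairs = strong_pairs(names, matrix, threshold=0.5)
for a, b, value in pairs:
    print(f"{a} ~ {b}: {value:.3f}")
```

Both pairs it reports for this sub-matrix also appear in the report's high-correlation alerts.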

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
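
One common way to compute a nullity correlation is to turn each column into a 0/1 missingness indicator and take the Pearson correlation of the indicators. The sketch below assumes that definition; the function name `nullity_correlation`, the use of `None` for missing cells, and the three toy columns are all hypothetical and not taken from the dataset.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missingness indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return None  # undefined when a column is never (or always) missing
    return cov / sqrt(var_a * var_b)


# Hypothetical toy columns; None marks a missing cell.
progress = [None, 100.0, 0.61, None]
tcc_pending = [None, "Não", "Não", None]
income = [0.17, None, 0.26, 0.50]

print(nullity_correlation(progress, tcc_pending))  # missing together
print(nullity_correlation(progress, income))       # never missing together
```

A value near 1 means two columns tend to be missing in the same rows, and a negative value means one tends to be present when the other is absent. With pandas, the equivalent is `df.isna().corr()`.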

Sample

Ano Letivo de Previsão de Conclusão | Ano de Ingresso | Campus | Código Curso | Descrição do Curso | Estado Civil | Etnia/Raça | Forma de Ingresso | Frequência no Período | I.R.A. | Matriz | Modalidade | Nível de Ensino | Percentual de Progresso | Período Atual | Período de Ingresso | Renda Per Capita | Sexo | Tipo de Escola de Origem | Turno | Zona Residencial | Prática Profissional Pendente | Carga Horária de Prática Profissional Pendente | Registro de TCC Pendente | Carga Horária de Seminário Pendente | Carga Horária Obrigatória Pendente | curso | class
020242023AP8079Técnico de Nivel Médio em Agropecuária, na Forma Subsequente [2012] - Campus ApodiSolteiroPardaAmpla Concorrência81.530.00136 - Técnico Subsequente em Agropecuária (2012)Técnico SubsequenteMédioNaN110.17MPúblicaMatutinoUrbanaNaNNaNNaNNaNNaNTécnico em Agropecuária0
120212018AP8401Técnico de Nivel Médio em Informática, na Forma Integrado (2012) - Campus ApodiSolteiroBrancaAmpla Concorrência99.3683.36106 - Técnico Integrado em Informática (2012)Técnico IntegradoMédio100.00410.61MPrivadaVespertinoUrbanaNãoNãoNãoNãoNãoTécnico em Informática0
220202018AP8407Técnico de Nível Médio em Manutenção e Suporte em Informática, na Forma Subsequente [2014] - Campus ApodiSolteiroBrancaAmpla Concorrência0.0032.30148 - Técnico Subsequente em Manutenção e Suporte em Informatica (2012)Técnico SubsequenteMédio0.61110.26MPúblicaNoturnoUrbanaSimSimNãoSimSimTécnico em Manutenção e Suporte em Informática1
320222020AP8428Técnico de Nível Médio em Química, na Forma Subsequente (2015) - Campus ApodiCasadoPardaAmpla Concorrência0.0090.86152 - Técnico Subsequente em Química (2012)Técnico SubsequenteMédio6.53120.32MPúblicaVespertinoUrbanaSimSimNãoSimSimTécnico em Química1
420222019AP8427Técnico de Nível Médio em Biocombustíveis, na Forma Integrado (2012) - Campus ApodiSolteiroPardaL6 - Qualquer Renda / Autodeclarado PPI95.4582.70180 - Técnico Integrado em Biocombustíveis (2012)Técnico IntegradoMédio100.00411.32FPúblicaVespertinoUrbanaNãoNãoNãoNãoNãoTécnico em Biocombustíveis0
520212018AP8026Técnico de Nivel Médio em Agropecuária, na Forma Integrada (2015) - Campus ApodiSolteiroBrancaAmpla Concorrência99.8387.61178 - Técnico Integrado em Agropecuária (2012)Técnico IntegradoMédio100.00410.34FPrivadaVespertinoUrbanaNãoNãoNãoNãoNãoTécnico em Agropecuária0
620212018AP8026Técnico de Nivel Médio em Agropecuária, na Forma Integrada (2015) - Campus ApodiCasadoBrancaAmpla Concorrência93.8473.21178 - Técnico Integrado em Agropecuária (2012)Técnico IntegradoMédio100.00410.46FPúblicaVespertinoUrbanaNãoNãoNãoNãoNãoTécnico em Agropecuária0
720222019AP8413Licenciatura em Química (2012) - Campus ApodiSolteiroBrancaL5 - Qualquer Renda / Qualquer Etnia (SISU)90.1269.80327 - Licenciatura em Química [2018]LicenciaturaGraduação34.78210.17MPúblicaVespertinoUrbanaSimSimSimSimSimLicenciatura em Química0
820232019AP8413Licenciatura em Química (2012) - Campus ApodiSolteiroPardaTransferência Facultativa0.0068.31327 - Licenciatura em Química [2018]LicenciaturaGraduação14.81222.00MPúblicaNoturnoUrbanaSimSimSimSimSimLicenciatura em Química1
920222018AP8413Licenciatura em Química (2012) - Campus ApodiSolteiroPretaL2 - Renda <= 1,5 / Autodeclarados PPI100.000.00151 - Licenciatura em Química (2012)LicenciaturaGraduação0.00110.50MPúblicaNoturnoUrbanaNaNNaNNaNNaNNaNLicenciatura em Química1
Ano Letivo de Previsão de Conclusão | Ano de Ingresso | Campus | Código Curso | Descrição do Curso | Estado Civil | Etnia/Raça | Forma de Ingresso | Frequência no Período | I.R.A. | Matriz | Modalidade | Nível de Ensino | Percentual de Progresso | Período Atual | Período de Ingresso | Renda Per Capita | Sexo | Tipo de Escola de Origem | Turno | Zona Residencial | Prática Profissional Pendente | Carga Horária de Prática Profissional Pendente | Registro de TCC Pendente | Carga Horária de Seminário Pendente | Carga Horária Obrigatória Pendente | curso | class
4668620222020ZN4069Tecnologia em Marketing (2015) - Campus Zona NorteSolteiroPretaL2 - Renda <= 1,5 / Autodeclarados PPI (SISU)0.000.00365 - Tecnologia em Marketing [2019]TecnologiaGraduação0.0011NaNMPúblicaNoturnoUrbanaSimSimSimSimSimTecnologia em Marketing1
4668720222019ZN4206Técnico de Nível Médio em Eletrônica, na Forma Integrado (2012) - Campus Zona NorteSolteiroPardaL6 - Qualquer Renda / Autodeclarado PPI93.2082.35201 - Técnico Integrado em Eletrônica (2012)Técnico IntegradoMédio100.00410.31MPúblicaMatutinoUrbanaNãoNãoNãoNãoNãoTécnico em Eletrônica0
4668820202018ZN4407Técnico de Nível Médio em Manutenção e Suporte em Informática, na Forma Subsequente [2012] - Campus Zona NorteSolteiroBrancaAmpla Concorrência88.7888.67148 - Técnico Subsequente em Manutenção e Suporte em Informatica (2012)Técnico SubsequenteMédio100.00411.31MPrivadaMatutinoUrbanaNãoNãoNãoNãoNãoTécnico em Manutenção e Suporte em Informática0
4668920232021ZN4407Técnico de Nível Médio em Manutenção e Suporte em Informática, na Forma Subsequente [2012] - Campus Zona NorteSolteiroPretaAmpla Concorrência0.0046.39148 - Técnico Subsequente em Manutenção e Suporte em Informatica (2012)Técnico SubsequenteMédio36.28320.55MPrivadaNoturnoUrbanaSimSimNãoSimSimTécnico em Manutenção e Suporte em Informática1
4669020242022ZN4407Técnico de Nível Médio em Manutenção e Suporte em Informática, na Forma Subsequente [2012] - Campus Zona NorteSolteiroPardaAmpla Concorrência0.0090.14148 - Técnico Subsequente em Manutenção e Suporte em Informatica (2012)Técnico SubsequenteMédio39.00320.41MPrivadaNoturnoUrbanaSimSimNãoSimSimTécnico em Manutenção e Suporte em Informática0
4669120262023ZN4206Técnico de Nível Médio em Eletrônica, na Forma Integrado (2012) - Campus Zona NorteSolteiroBrancaL1 - Renda <= 1,5 / Qualquer Etnia98.900.00201 - Técnico Integrado em Eletrônica (2012)Técnico IntegradoMédioNaN110.51MPúblicaVespertinoUrbanaNaNNaNNaNNaNNaNTécnico em Eletrônica0
4669220222020ZN4407Técnico de Nível Médio em Manutenção e Suporte em Informática, na Forma Subsequente [2012] - Campus Zona NorteSolteiroPardaAmpla Concorrência100.000.00148 - Técnico Subsequente em Manutenção e Suporte em Informatica (2012)Técnico SubsequenteMédioNaN120.78MPúblicaNoturnoUrbanaNaNNaNNaNNaNNaNTécnico em Manutenção e Suporte em Informática1
4669320222019ZN4111Técnico de Nível Médio em Informática para Internet, na Forma Integrado (2014) - Campus Zona NorteSolteiroBrancaAmpla Concorrência83.6279.93202 - Técnico Integrado em Informática para Internet (2014)Técnico IntegradoMédio100.00410.00MPrivadaVespertinoUrbanaNãoNãoNãoNãoNãoTécnico em Informática para Internet0
4669420222019ZN4206Técnico de Nível Médio em Eletrônica, na Forma Integrado (2012) - Campus Zona NorteSolteiroPardaL2 - Renda <= 1,5 / Autodeclarados PPI89.9279.47201 - Técnico Integrado em Eletrônica (2012)Técnico IntegradoMédio100.00410.12MPúblicaMatutinoUrbanaNãoNãoNãoNãoNãoTécnico em Eletrônica0
4669520222019ZN4069Tecnologia em Marketing (2015) - Campus Zona NorteSolteiroIndígenaL6 - Qualquer Renda / Autodeclarado PPI (SISU)0.0054.81365 - Tecnologia em Marketing [2019]TecnologiaGraduação42.00120.41FPúblicaNoturnoUrbanaSimSimSimSimSimTecnologia em Marketing0

Duplicate rows

Most frequently occurring

Ano Letivo de Previsão de Conclusão | Ano de Ingresso | Campus | Código Curso | Descrição do Curso | Estado Civil | Etnia/Raça | Forma de Ingresso | Frequência no Período | I.R.A. | Matriz | Modalidade | Nível de Ensino | Percentual de Progresso | Período Atual | Período de Ingresso | Renda Per Capita | Sexo | Tipo de Escola de Origem | Turno | Zona Residencial | Prática Profissional Pendente | Carga Horária de Prática Profissional Pendente | Registro de TCC Pendente | Carga Horária de Seminário Pendente | Carga Horária Obrigatória Pendente | curso | class | # duplicates
11320252023ZL15806Tecnologia em Sistemas para Internet [2023] - Campus ZLSolteiroPardaAmpla Concorrência100.00.0448 - Tecnologia em Sistemas para Internet EAD [2023]TecnologiaGraduação0.011NaNMPúblicaEADUrbanaSimSimSimSimSimTecnologia em Sistemas para Internet09
14420262023ZL15430Licenciatura em Matemática (2023) - Campus Zona LesteSolteiroPardaAmpla Concorrência100.00.0457 - Licenciatura em matemática EaD [2023]LicenciaturaGraduação0.011NaNMPúblicaEADUrbanaSimSimSimSimSimLicenciatura em Matemática07
10820252023ZL15806Tecnologia em Sistemas para Internet [2023] - Campus ZLSolteiroBrancaAmpla Concorrência100.00.0448 - Tecnologia em Sistemas para Internet EAD [2023]TecnologiaGraduação0.011NaNMPúblicaEADUrbanaSimSimSimSimSimTecnologia em Sistemas para Internet05
10320252023CNAT1436Tecnologia em Gestão Pública (2012) - Natal-CentralSolteiroBrancaAmpla Concorrência100.00.0425 - Tecnologia em Gestão Pública [2022]TecnologiaGraduaçãoNaN12NaNFPrivadaVespertinoUrbanaNaNNaNNaNNaNNaNTecnologia em Gestão Pública04
12020262023CNAT1304Tecnologia em Gestão Ambiental (2012) - Campus Natal-CentralSolteiroBrancaAmpla Concorrência100.00.0394 - Tecnologia em Gestão Ambiental (2020)TecnologiaGraduação0.012NaNFPrivadaNoturnoUrbanaSimSimSimSimSimTecnologia em Gestão Ambiental04
12220262023CNAT1404Tecnologia em Análise e Desenvolvimento de Sistemas (2012) - Campus Natal-CentralSolteiroBrancaAmpla Concorrência100.00.091 - Tecnologia em Análise e Desenvolvimento de Sistemas (2012)TecnologiaGraduaçãoNaN12NaNMPrivadaVespertinoUrbanaNaNNaNNaNNaNNaNTecnologia em Análise e Desenvolvimento de Sistemas04
13420262023ZL15304Tecnologia em Gestão Ambiental EaD [2012] - Campus EaD (UAB)SolteiroBrancaAmpla Concorrência0.00.0134 - Tecnologia em Gestão Ambiental (EAD)TecnologiaGraduaçãoNaN12NaNFPúblicaEADUrbanaNaNNaNNaNNaNNaNTecnologia em Gestão Ambiental EaD04
13720262023ZL15304Tecnologia em Gestão Ambiental EaD [2012] - Campus EaD (UAB)SolteiroPardaAmpla Concorrência0.00.0134 - Tecnologia em Gestão Ambiental (EAD)TecnologiaGraduaçãoNaN12NaNMPúblicaEADUrbanaNaNNaNNaNNaNNaNTecnologia em Gestão Ambiental EaD04
13820262023ZL15430Licenciatura em Matemática (2023) - Campus Zona LesteCasadoBrancaAmpla Concorrência100.00.0457 - Licenciatura em matemática EaD [2023]LicenciaturaGraduação0.011NaNMPúblicaEADUrbanaSimSimSimSimSimLicenciatura em Matemática04
8420242022CA10116Técnico de Nível Médio em Vestuário, na Forma Subsequente [2012] - Campus CaicóSolteiroBrancaAmpla Concorrência0.00.0140 - Técnico Subsequente em Vestuário (2012)Técnico SubsequenteMédio0.012NaNFPúblicaNoturnoUrbanaSimSimNãoSimSimTécnico em Vestuário13
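
A table of the most frequent duplicate rows can be approximated by counting identical rows and keeping those that occur more than once. This is a minimal sketch with the standard library: the function name `duplicate_groups` and the three-field toy rows are hypothetical, and the count here is the total number of copies, which may or may not match how the report defines "# duplicates".

```python
from collections import Counter


def duplicate_groups(rows):
    """Return [(row, n_copies)] for rows occurring more than once, most frequent first."""
    counts = Counter(tuple(r) for r in rows)
    return [(row, n) for row, n in counts.most_common() if n > 1]


# Hypothetical toy rows (campus, course code, ethnicity), not real records.
rows = [
    ("ZL", 15806, "Parda"),
    ("ZL", 15806, "Parda"),
    ("ZL", 15430, "Parda"),
    ("ZL", 15806, "Parda"),
    ("CA", 10116, "Branca"),
    ("ZL", 15430, "Parda"),
]

groups = duplicate_groups(rows)
for row, n in groups:
    print(row, n)
```

With pandas, `DataFrame.duplicated(keep=False)` flags every member of a duplicate group, which is a convenient starting point for deciding whether to drop them before modeling.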